Same as Table XVII, except for five EPI's


Contingency tables and related statistics for
both dependent and independent North Atlantic
Ocean area 3W, 15 May-15 July 1983, data,
with linear-regression equations as predictors,
from stage four of the developmental model
using four EPI's


Same as Table XIX, except results are from
stage two of the developmental model and
predictors are divided into eight EPI's each
Linear-regression equations for the
[illegible] on the visibility category
(y) for both regression methods, y
statistics and threshold values from both
threshold models, North Atlantic Ocean area
3W, 15 May-15 July 1983


Contingency tables and related statistics
from linear regression method 1 (single
equation), quadratic threshold model, for
both dependent and independent North
Atlantic Ocean area 3W, 15 May-15 July
1983, data, with all predictors available
to the regression model


Same as Table XXII, except using the equal
variance threshold model


Contingency tables and related statistics
from linear regression method 2 (decision-tree),
quadratic threshold model, for both
dependent and independent North Atlantic
Ocean area 3W, 15 May-15 July 1983, data,
with all predictors available to the
regression model


Same as Table XXIV, except using the equal
variance threshold model


Contingency tables and related statistics
from linear regression method 2 (decision-tree),
quadratic threshold model, for both
dependent and independent North Atlantic
Ocean area 3W, 15 May-15 July 1983, data, with
only those predictors identified as best by
the Preisendorfer methodology available to
the regression model


Same as Table XXVI, except using the equal
variance threshold model


Summary of the contingency table statistics
for all MOS variations used in the North
Atlantic Ocean area 3W, 15 May-15 July 1983
Homogeneous areas for the North Atlantic
Ocean, June and July, from Lowe (1984 b)


The behavior of contingency table statistics
for dependent and independent data, as the
number of EPI's is varied, for the North Atlantic
Ocean area 3W, 15 May-15 July 1983, when predictors
are chosen based upon the maximum increase
in a0 for dependent data, for (a) a single
predictor, (b) two predictors, (c) three
predictors, (d) four predictors, and (e) five
predictors


Same as Fig. 2, except predictors, after the
first, are selected by having the lowest RSS FD
for (a) two predictors, (b) three predictors,
(c) four predictors, (d) five predictors, and
(e) six predictors


The behavior of functional dependence (FD) as
determined from 100 randomly generated data
sets for EPI's of two through ten for (a) the
North Atlantic Ocean area 3W, 15 May-15 July
1983, dependent data (1526 observations) and
(b) the North Pacific Ocean, July 1979,
dependent data (3682 observations)


First stage contingency table statistics AAO,
dependent data, and ATS1, independent data,
North Pacific Ocean, July 1979, as a function
of the number of EPI's, from the Preisendorfer
methodology


Contingency table statistics AAO and ATS1 for
both dependent and independent North Pacific
Ocean, July 1979, data as a function of the
number of predictors in the model for
strategies (a) MAXPROB1 and (b) MAXPROB2,
with predictors each divided into five EPI's


Same as Fig. 5, except for the North Atlantic
Ocean area 3W, 15 May-15 July 1983
Behavior of a0(96), [illegible], a1(05), PP(96)
and PP(05) from 100 randomly generated
data sets, using predictors from the North
Atlantic Ocean area 3W experiment, with each
predictor divided into four EPI's, for (a) as
each predictor is added and (b) as the forecast
array size increases


Same as Fig. 8, except each predictor is
[illegible]


Contingency table statistics AAO and ATS1 for
both dependent and independent North Atlantic
Ocean area 3W, 15 May-15 July 1983, data, without
linear regression equations as predictors, as a
function of the number of predictors in the
model for strategies (a) MAXPROB1 and (b) MAXPROB2,
with predictors each divided into eight
EPI's


Same as Fig. 10, except predictors each divided
[illegible]


Contingency table statistics AAO and ATS1 for
both dependent and independent North Atlantic
Ocean area 3W, 15 May-15 July 1983, data, with
linear-regression equations as predictors, as a
function of the number of predictors in the
model for strategies (a) MAXPROB1 and (b) MAXPROB2,
with predictors each divided into four
EPI's


Same as Fig. 12, except predictors each divided
[illegible]


Bivariate plot of EHF as a function of both
equally populous intervals and visibility
[illegible]


Joint and marginal probabilities of VISCAT's
as a function of EPI's for EHF


Conditional probabilities of VISCAT's as a
[illegible]


Sample calculation of the average visibility
category (VISCAT), natural-regression strategy,
from the first EPI (i=1) of predictor EHF


Sample calculation of potential predictability
(PP) of visibility by predictor EHF


Skill diagram with lines of constant [illegible]
Bivariate plots, conditional probabilities, PP's
and skill scores, maximum-probability strategy,
[illegible]


Tabular presentation of a three-dimensional
problem with predictors EHF and RH, each divided
into three EPI's, as a function of VISCAT's
and reduction of the problem to two dimensions


Tabular presentation of a four-dimensional
problem with predictors EHF, RH and FTER, each
divided into three EPI's, as a function of
VISCAT's and reduction of the problem to two
dimensions


Sample calculation of functional dependence
[illegible]


Example of incremental marginal probabilities
for a bivariate predictor, derived from Fig.
[illegible], and uniform probabilities for VISCAT's,
used to generate random data sets by Monte-Carlo
means







I. INTRODUCTION AND BACKGROUND


Model output statistics (MOS) is a technique whereby 
parameters output from numerical weather prediction models 
(predictors) are statistically processed, with observed 
data, to produce forecasts of one of the following categories
of parameters (as predictands):

a. operationally important parameters not output by the
numerical weather prediction model (e.g., visibility, cloud
cover, ceiling);

b. model output parameters whose predictive skill is 
improved (e.g., surface wind, temperature) due to 
correction of numerical model bias and/or scale. 

Historically, the methodology has consisted of generating 
empirical equations by a linear, least-squares regression 
model. This technique is used by both the National Weather 
Service and the United States Air Force Air Weather Service 
and has demonstrated operationally usable skill in forecast- 
ing numerous weather elements at locations over land 
throughout the world [Best and Pryor, 1983]. Attempts by 
the United States Navy to forecast open-ocean fog and visi- 
bility using linear regression equations have shown skills 
of marginal operational usefulness but exceeding those of 
persistence and climatology [Aldinger, 1979; Yavorsky, 1980;
Selsor, 1980; Koziara et al, 1983; Renard and Thompson, 1984].
This level of performance is due, in part, to
the lack of 'calibrated' fog and visibility observations.
Shipboard weather observers lack sufficient reference points
to be able to accurately estimate the range of atmospheric
visibility.

In the spring of 1983, the United States Navy made the
decision to begin development of a MOS program to forecast
operational air/ocean parameters over the oceans of the
world. Primarily because of the importance of horizontal
visibility to the mariner, this parameter was selected as
the initial candidate. However, because of less-than-perfect
prior results using linear regression in the North Pacific
Ocean, it was decided to investigate other methodologies
to determine if a better one could be found.

This study presents statistical methodologies proposed by 
Preisendorfer (1983 a,b,c). Specifically, three strategies, 
two based on maximum-probability and one based on natural- 
regression, are further developed, tested and applied to sets 
of model output parameters from both the North Pacific and 
North Atlantic Ocean areas. In addition, multiple linear 
regression is applied to the same data. Innovative threshold 
techniques, developed by Lowe (1984a), are also applied, and 
methodologies are compared. 

In the following discussion, a sufficient number of terms
and symbols are defined to allow readers without strong
statistical backgrounds to understand the results. However,
for a proper understanding of the Preisendorfer (1983 a,b,c)
methodology, readers are encouraged to read Appendix A,
which contains a detailed discussion. Similarly, details on
the linear regression model and threshold procedures [Lowe,
1984a] are to be found in Appendix B.
II. OBJECTIVES AND APPROACH


The objective of this study is to determine if a statistical
methodology, applied to discrete values of model
output and derived parameters, can improve upon the forecasting
of horizontal marine atmospheric visibility when
compared to linear regression. The approach is as follows:

a. define categorical groupings of visibility which
relate to operational use at sea.

b. develop and apply the Preisendorfer (1983 a,b,c)
methodology using July 1979 North Pacific Ocean data.

c. apply the methodology developed in b. above to June
1983 North Atlantic Ocean data.

d. compare Preisendorfer (1983 a,b,c) results to those
of the Lowe (1984a) linear regression approach for
the North Pacific and North Atlantic Ocean data sets.





III. DATA 


A. VISIBILITY OBSERVATIONS AND SYNOPTIC CODE

Visibility observations at sea are reported as one of
ten synoptic codes, ranging from 90 (visibility less than
50 m) to 99 (visibility equal to or greater than 10 km).
However, in view of the inexactness of observing and recording
marine visibility, in category form, and the further
degradation of its interpretation by users in forecasting,
a simplified categorization of visibility was developed as
follows:

Category    synoptic code    visibility range
I           90-94            < 2 km
II          95-96            >= 2 km and < 10 km
III         97-99            >= 10 km


This scheme is based upon the following operational
criteria, which apply when observed visibility falls below
the indicated value:

a. 10.0 km (5 n mi)--United States Navy aircraft carrier
flight recovery operations change from visual to controlled
approach [Department of the Navy, 1979].

b. 2 km (1 n mi)--sounding of reduced visibility signals
for all vessels operating in international waters.


(The term 'reduced visibility' is not defined in the
International Regulations for Preventing Collisions at
Sea, 1972. However, United States Navy Captains and
Merchant Marine Masters generally consider it to be
1 n mi.)


B. NORTH PACIFIC OCEAN DATA 

The data from the North Pacific Ocean are described by 
Selsor (1980) and Koziara et al (1983). Only the July 1979 
model initialization (TAU00) data are used, consisting of 19 
model output parameters (MOP) from the Northern Hemisphere 
models operational in 1979, namely, the Mass Structure Analy- 
sis, the Primitive Equation and the Marine Wind Models; and 
one climatological visibility parameter from the National 
Oceanic and Atmospheric Administration's National Climatic 
Data Center (NCDC), Asheville, North Carolina. Two additional 
parameters were derived from this set. A description of the
parameters is found in Appendix C.


C. NORTH ATLANTIC OCEAN DATA 

1. Area 

The North Atlantic Ocean, from 0° to 80°N, was
divided into physically homogeneous areas by Lowe (1984b)
using an appropriate cluster analysis technique. The primary
area used in this study is identified as area 3W on Fig. 1,
which illustrates the North Atlantic Ocean homogeneous areas.
This area was chosen because of the relatively frequent
occurrence of poor visibility as compared to the other areas.
A summary of visibility frequencies, for each homogeneous
area and three visibility categories, is contained in Table I.
2. Time Period 

Data from 15 May 1983 through 15 July 1983 were
combined to form the June 1983 data set, hereafter referred
to as FATJUNE. FATJUNE was chosen as the initial data set
because of the high frequency of occurrence of poor visibility
during this period. In order to maximize the credibility
of visibility observations, 1200 GMT synoptic ship
report data were used exclusively since this time corresponds
to daylight over the entire area of study during FATJUNE.

Model output parameter data (predictors) at 1200 GMT
model output time, hereafter referred to as TAU00, were used
in the development of the Preisendorfer (1983 a,b,c) methodology,
time not being available to pursue the scheme beyond that
stage. Thus, TAU00 represents model initialization time.
However, the term 'forecast' will be used throughout this
study to represent the estimate of visibility at this
initialization time.

3. Synoptic Weather Reports

All synoptic visibility observations (predictand 
data) for this study were quality-control checked and pro- 
vided by the Naval Oceanography Command Detachment (NOCD) 
co-located with the NCDC. Those furnished observations which 
contain systematic observer error or are suspect or obviously 
erroneous, as determined from the data quality indicators, 


are not incorporated in the final data set. 
4. Predictor Parameters

Fifty TAU00 model output parameters (MOP's) (predictor
data) were provided for the period of study by the Fleet
Numerical Oceanography Center (FNOC), Monterey, California. 
These parameters are from their current operational prediction 
model, the Navy Operational Global Atmospheric Prediction 
System (NOGAPS). All MOP's were interpolated from model grid 
coordinates to synoptic ship observation positions using a 
linear interpolation scheme. Of the 50 parameters provided, 
only 35 were used in the development of the Preisendorfer 
(1983 a,b,c) and Lowe (1984a) methodologies, the remainder 
being considered as either having little likelihood of 
importance in the forecasting of visibility or not usable 
due to the lack of significant digits (which were lost during 
the transfer from FNOC tapes to the main computer center's 
mass storage data system). Twelve additional parameters were 
derived from the interpolated MOP's. Seven of these are 
equations derived from a linear regression model which will 
be described in Chapter V and Appendix B. Each equation 
represents an estimate of the visibility category, which is 
used as a predictor. A list of all of the predictor parameters
is provided in Appendix D.


D. DEPENDENT/INDEPENDENT DATA SETS

Due to the limited amount of data available to this
study for each of the North Atlantic Ocean homogeneous
areas, it was necessary to withhold one-third of the
observations from the developmental model to use as an independent
data set. This was accomplished by the use of a
counter and transfer statement in the computer programs which
prevented every third observation from entering the developmental
computations. To ensure that the dependent and independent
data were representative of the same population, a
95% confidence interval for proportions [Miller and Freund,
1977] was established from the entire data set, for each
visibility category, and the dependent and independent data
sets were constrained to have visibility frequencies within
these established confidence intervals. This same procedure
was applied to the North Pacific Ocean data for consistency of
method. Table II summarizes the dependent and independent
data for both the North Atlantic Ocean and North Pacific
Ocean data sets.
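
The withholding of every third observation and the confidence-interval check can be sketched as follows. This is a minimal illustration in Python, not the original program. The function names, the normal-approximation form of the 95% interval, and the sample categories are all assumptions made for the example.

```python
import math

def split_every_third(observations):
    """Withhold every third observation as independent data."""
    dependent, independent = [], []
    for count, obs in enumerate(observations, start=1):
        (independent if count % 3 == 0 else dependent).append(obs)
    return dependent, independent

def proportion_interval(categories, category, z=1.96):
    """Assumed normal-approximation 95% interval for a category frequency."""
    n = len(categories)
    p = sum(1 for c in categories if c == category) / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width

def is_representative(full, subset, labels=(1, 2, 3)):
    """True if each category frequency in subset lies in the full-set interval."""
    for label in labels:
        low, high = proportion_interval(full, label)
        freq = sum(1 for c in subset if c == label) / len(subset)
        if not (low <= freq <= high):
            return False
    return True

# Hypothetical visibility categories (1 = I, 2 = II, 3 = III).
viscat = [1, 3, 3, 2, 3, 1, 3, 3, 2] * 50
dep, ind = split_every_third(viscat)
print(len(dep), len(ind), is_representative(viscat, ind))
```

The sample sequence is deliberately periodic, so the withheld third is not representative; with real observations the check would be expected to pass far more often.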
IV. PRELIMINARY EXPERIMENTS


A. TERMS AND SYMBOLS

The terms and statistical symbols defined below will be
used throughout the remainder of this report. The formal
mathematical definitions can be found in Appendices A and
E.


1. Maximum-probability strategy--choosing forecast
visibility categories based upon the highest conditional
probabilities of visibility within a predictor interval.

2. MAXPROB1--designation of the maximum-probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are resolved by
the generation of a random number.

3. MAXPROB2--designation of the maximum-probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are resolved by
assigning the lowest visibility category, of those
tied, as the forecast category.

4. Natural-regression strategy--choosing forecast visibility
categories based upon the statistical average
of the conditional probabilities of visibility within
a predictor interval.

5. a0--the probability of a zero-class visibility category
forecast error (e.g., if visibility category I is forecast,
it is also observed).

6. a1--the probability of a one-class visibility category
forecast error (e.g., if visibility category I is
forecast and category II is observed).

7. a2--the probability of a two-class visibility category
forecast error (e.g., if visibility category I is
forecast and category III is observed).

8. CE--class error parameter defined as a1 + 2a2, used to
identify the first predictor.

9. PP--the potential predictability of visibility by
any given predictor.

10. FD--the functional dependence of one predictor on
another. This is a measure of functional dependence
of a statistical kind and not of the deterministic
kind. The term 'functional dependence' is used by
Preisendorfer (1983c) and, being sufficiently descriptive
of the concept, it will be used herein.

11. RSS FD--root sum squared FD. The functional dependence
of a predictor on all predictors already included in
the developmental model. It is equal to the square-root
of the sum of the squares of the individual FD's.

12. TS1--threat score for visibility category I computed
from a contingency table.

13. ATS1--adjusted threat score for visibility category
I which removes the influence of the data set category
frequency.

14. AAO--adjusted a0. A contingency table statistic
which removes the influence of the most frequent visibility
category in a set of data (similar to a normalized
value).

15. EPI--equally populous predictor interval used to
discretize the predictors.
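
The EPI discretization and the three forecast strategies defined above can be illustrated with a short sketch. It is not the original program: the rank-based binning (which may split tied predictor values across intervals), the rounding of the natural-regression average to the nearest category, the function names, and the sample values are all assumptions. The formal definitions are in Appendix A.

```python
import random
from collections import Counter

def epi_index(values, n_intervals):
    """Assign each value to one of n equally populous intervals by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    index = [0] * len(values)
    for rank, i in enumerate(order):
        index[i] = rank * n_intervals // len(values)
    return index

def conditional_probabilities(intervals, viscat, n_intervals, n_cat=3):
    """P(category | interval), one row per interval."""
    table = []
    for k in range(n_intervals):
        counts = Counter(c for i, c in zip(intervals, viscat) if i == k)
        total = sum(counts.values())
        table.append([counts.get(c, 0) / total if total else 0.0
                      for c in range(1, n_cat + 1)])
    return table

def maxprob_forecast(row, strategy="MAXPROB2", rng=random):
    """Category with the highest conditional probability."""
    best = max(row)
    tied = [c + 1 for c, p in enumerate(row) if p == best]
    if strategy == "MAXPROB1":
        return rng.choice(tied)   # tie resolved by a random number
    return min(tied)              # MAXPROB2: lowest category of those tied

def natural_regression_forecast(row):
    """Probability-weighted average category, rounded (rounding is assumed)."""
    mean = sum((c + 1) * p for c, p in enumerate(row))
    return int(mean + 0.5)

# Hypothetical predictor values and observed categories (1 = I, 2 = II, 3 = III).
ehf = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.7, 0.6, 0.05]
vis = [1, 2, 1, 3, 3, 1, 3, 2, 2]
bins = epi_index(ehf, 3)
probs = conditional_probabilities(bins, vis, 3)
forecasts = [maxprob_forecast(row) for row in probs]
averages = [natural_regression_forecast(row) for row in probs]
```

With more than one predictor, the same conditional probabilities would be computed over each combination of intervals rather than over a single row of intervals.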


B. COMPUTER PROGRAMS

Four computer programs were developed to test the
proposed Preisendorfer (1983 a,b,c) methodology. The
programs are on file in the Department of Meteorology, Naval
Postgraduate School, Monterey, California, 93943.

1. A program to compute a0, a1, CE and PP for all predictors,
all strategies (MAXPROB1, MAXPROB2 and Natural-
Regression) and a single number of EPI's. Statistics
for the three strategies are based upon the same predictor(s)
rather than the best predictor(s) for each
strategy. It was determined during program development,
and will be shown in Chapter VI, that, in general, each
of the strategies chose the same predictor(s).

2. A program to compute FD for all predictors, on a given
predictor, for a given number of EPI's, and to compute
the upper 5% critical value (FD(96)) by Monte-Carlo
means (Appendix A).

3. A program to construct contingency tables and to compute
skill and threat scores, for both the dependent
and independent data.

4. A program to generate 100 random data sets, from the
marginal probabilities of the predictor(s) in the
developmental model, and to compute upper and lower
5% critical values for a0 and a1 to be used for testing
the significance of the results from the Preisendorfer
(1983 a,b) methodology against chance.
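
As an illustration of the contingency-table statistics these programs produce, the sketch below computes a0, a1, a2, CE and a threat score from paired forecast and observed categories. It is a hedged example, not the original code. The threat score is taken here in its common form, hits / (forecasts + observations - hits), which is an assumption since the formal definition is in Appendix E. The function names and sample data are hypothetical.

```python
def contingency_table(forecast, observed, n_cat=3):
    """Rows are forecast categories, columns are observed categories."""
    table = [[0] * n_cat for _ in range(n_cat)]
    for f, o in zip(forecast, observed):
        table[f - 1][o - 1] += 1
    return table

def class_error_stats(table):
    """Return (a0, a1, a2, CE), with CE = a1 + 2*a2."""
    total = sum(sum(row) for row in table)
    a = [0.0, 0.0, 0.0]
    for f, row in enumerate(table):
        for o, count in enumerate(row):
            a[abs(f - o)] += count / total   # index = size of the class error
    return a[0], a[1], a[2], a[1] + 2.0 * a[2]

def threat_score(table, category):
    """Assumed form: hits / (forecast total + observed total - hits)."""
    k = category - 1
    hits = table[k][k]
    forecast_total = sum(table[k])
    observed_total = sum(row[k] for row in table)
    denom = forecast_total + observed_total - hits
    return hits / denom if denom else 0.0

# Hypothetical forecast/observed visibility categories.
forecast = [1, 1, 2, 3, 3, 3, 1, 2]
observed = [1, 2, 2, 3, 3, 1, 1, 3]
table = contingency_table(forecast, observed)
a0, a1, a2, ce = class_error_stats(table)
ts1 = threat_score(table, 1)
```

The adjusted statistics AAO and ATS1 are not reproduced here because their formulas are not given in this chapter.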


C. BEHAVIOR OF a0 AND THREAT SCORES

Before attempting a formal application of the Preisendorfer
(1983 a,b,c) methodology, it was considered prudent
to investigate the behavior of certain statistics as the
number of equally populous predictor intervals was changed
and as new predictors were added. It was found, during
program testing and before a formal procedure had been established,
that the independent data threat score of visibility
category I (TS1) generally showed higher values than other
threat scores (TS2, TS12) for the independent data. Therefore,
it was decided that the dependent and independent data
a0 and TS1 scores would be compared. The statistic a0 was
chosen because it is the single most important scoring
parameter in the Preisendorfer methodology.

The experiment consisted of choosing the first predictor
as that one which gave the highest a0 value when divided
into ten equally populous intervals. Once this predictor
was chosen, dependent and independent data a0 and TS1 scores
were computed for each number of intervals as the number was
varied from two to 100. Prior to proceeding to the next
step, the number of intervals which gave the highest independent
data TS1 score was identified and the first predictor
was held at this number of intervals for the remainder of
the experiment.

Subsequent predictors were chosen by both a maximum a0
test and a functional dependence test. As each subsequent
predictor was identified, its number of equally populous
intervals was varied from two to 50 (or less, as the maximum
array size was set at 120,000). The number of equally populous
intervals giving the highest independent data TS1 was
identified and held fixed for the following stage. This procedure
was repeated until either six predictors were used or
until a new predictor addition did not allow the comparison
of at least intervals two through ten, due to computer
storage limitations. It should be noted here that all of
the North Atlantic Ocean parameters, not including linear-
regression equations, were used in these experiments and,
subsequently, some parameters were removed from consideration
(Appendix D).

1. Maximum a0 Method

The first NOGAPS predictor selected was SMF which
was varied from two to 100 EPI's (Fig. 2a) and the highest
TS1 score was obtained with six intervals. The second predictor
chosen, when SMF was held at six intervals and all
others at ten, was DTDP which produced the highest a0 value
for two predictors. Holding SMF at six intervals, DTDP was
varied from two to 50 intervals (Fig. 2b) and the highest
TS1 score was obtained at 20 intervals. Anticipating problems
with the subsequent array size with respect to the number of
predictors which could be included, the secondary TS1 maximum
at 16 intervals was used for further stepping. The third and
subsequent predictors and their optimum interval sizes were
[illegible] at 12 (Fig. 2c), UBLW at ten (Fig. 2d) and V400 (Fig. 2e).
The optimum number of intervals for V400 was not germane as
no further stepping was done after this step. As illustrated
in Fig. 2, the dependent data statistics asymptotically approach
unity, as predictors are added, while the independent data
statistics (approximate maximum values: a0 = [illegible], TS1 = .35)
show no further increase after the third predictor is included,
which may imply a limit as to how well the methodology performs
on this particular data set.

2. Functional Dependence Method 

As functional dependence is not considered until after
the selection of the first NOGAPS predictor, Fig. 2a is also
applicable to this method. Subsequent predictors were chosen
as those having the lowest RSS FD using ten equally populous
intervals. The predictors selected and their optimum interval
sizes, for the TS1 score, were RH at three (Fig. 3a),
[illegible] at four (Fig. 3b), VOR925 at two (Fig. 3c), ENTRN at
14 (Fig. 3d) and UBLW (Fig. 3e) which was the last predictor
considered. As seen for the maximum a0 method, the dependent
data statistics asymptotically approach unity. However, the
independent data statistics continue to grow at least through
the addition of the sixth predictor (approximate maximum
values: a0 = .71, TS1 = .38). This method gave better results
than the maximum a0 method, though it, too, may imply a
limit. The results of this experiment also tend to show a
preferential selection of a small number of EPI's, for best
independent data TS1 score, as well as indicating that functional
dependence is a relatively good choice as a deciding
factor for choosing predictors.


D. BEHAVIOR OF FUNCTIONAL DEPENDENCE

Another statistic investigated prior to the formal
application of the Preisendorfer (1983 a,b,c) methodology
was the distribution of functional dependence (FD) calculated
from 100 randomly generated data sets. The FD calculation is
based upon the relationship of the distribution of one predictor
to another. Because the predictors are divided into
the same number of EPI's for the calculation, the probability
of a randomly generated number falling into any given interval
for either predictor will be the same. Therefore, the
randomly generated FD values should be a function only of
the number of intervals and the number of data cases (subsequent
randomly generated calculations, during the formal
application of the methodology, showed this to be true).

The randomly generated FD experiment consisted of computing
the mean, upper and lower 5% critical values, and the
standard deviation of the 100 randomly generated values for
both 1526 observations (as in the North Atlantic Ocean Area
3W dependent data) and 3682 observations (as in the North
Pacific Ocean dependent data) and a comparison of the
results. As illustrated in Fig. 4, the FD values are similar
for a given interval size, differing only in the size of the
confidence interval and the standard deviation. The FD
values calculated for 3682 observations lie totally within
the upper and lower 5% critical values for 1526 observations.
Because of this relationship, future FD(96) values, used to
qualitatively determine how well a new predictor will contribute
to the developmental model, can be obtained by reading
from the graph rather than using valuable computer
resources, providing the number of equally populous intervals
is less than or equal to ten.
V. PROCEDURES


A. PREISENDORFER METHODOLOGY 


1. Determination of the First Predictor in Relation 
to the Number of Predictor Intervals 


A matter not considered in Preisendorfer (1983 a,b,c)
is how to choose an optimum number of equally populous predictor
intervals (EPI's) into which predictor data should
be divided. During the course of development, two important
realizations became evident, namely, (a) there is a tendency
for the methodology to give better results using a small
number of intervals, and (b) the NPS W.R. Church Computer
Center limits internal computer storage space to two megabytes
for routine programs. The first suggested, while the
second forced, the research to be limited to EPI's of less
than or equal to ten if more than three or four predictors
were to be considered. Once this was established, a procedure
was developed to look at all EPI's within the stated
limit.

The procedure involves computing the initial statistics
(a0, a1, CE and PP) for each predictor, for each strategy
(maximum-probability and natural-regression) and for EPI's
of two through ten. Then, the best first predictor for each
number of EPI's is determined, for each strategy, by meeting
one or both of the following conditions, when considered in
the indicated order:

a. lowest CE
b. highest PP
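
One possible reading of the two ordered conditions is a lexicographic choice: take the lowest CE, and use the highest PP only to separate predictors with equal CE. The sketch below assumes that reading. The function name and the CE and PP values are hypothetical.

```python
def best_first_predictor(stats):
    """stats maps predictor name -> (CE, PP).

    Lowest CE wins; highest PP breaks ties in CE (assumed interpretation).
    """
    return min(stats, key=lambda name: (stats[name][0], -stats[name][1]))

# Hypothetical first-stage statistics for one number of EPI's.
first_stage = {
    "EHF": (0.40, 0.10),
    "RH": (0.40, 0.15),
    "SMF": (0.45, 0.30),
}
chosen = best_first_predictor(first_stage)
```

Here EHF and RH tie on CE, so the higher PP of RH decides the choice.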
Once the best predictor for each number of EPI's is
known, it is then necessary to determine the optimum number
of EPI's. This is accomplished by computing threat and skill
scores (Appendix E) for both the dependent and independent
data and choosing, as the optimum number of EPI's, that which
gives both a relatively high adjusted a0 (AAO) for the dependent
data and a relatively high adjusted threat score for
visibility category I (ATS1) for the independent data. This
becomes a somewhat subjective endeavor and remains as the
only imprecise step in the methodology.
The statistic ATS1 is used on the independent data,
instead of a0, because it is the poor visibility categories
(I and II) which are of primary forecast interest and their
forecastability is manifested in their threat scores. It
will be shown that, in general, the adjusted threat scores
for visibility category II (ATS2) and for combined visibility
categories I and II (ATS12) are small compared to ATS1, or
negative, and that ATS12 is maximized when ATS1 is maximized.
Additionally, it will be shown that maximum a0 does not
necessarily coincide with maximum ATS1 in the independent
data. Hence, if a0 were used, the optimum combination of
predictors necessary to forecast the poor visibility categories
would not be included.


Once the number of EPI's is established, it is fixed
for all subsequent predictors considered for the developmental
model. Holding the number of intervals fixed is not an
absolute necessity; however, it allows for a much more rapid
development of the model. Once this number is determined for
the first predictor, it is used to calculate FD for the next
predictor because FD is calculated using the established
number of EPI's. The next stage statistics (a0, a1, CE and
PP) are also computed with each predictor divided into this
same number of EPI's.
2. Choosing the Second Predictor

The second predictor to be included in the model is
determined from its FD on the first predictor and from the
increase in a0 resulting from its inclusion. This is accomplished
by computing a0 with two predictors, namely, the
first predictor, as determined above, with each of the
remaining predictors. Those predictors which do not increase
a0 above its value as determined with the first predictor
alone are removed from further consideration for inclusion
into the set of predictors in the developmental model. FD
for each of the remaining predictors vs. the first predictor
is computed. The remaining predictor with the lowest FD,
on the first predictor, is chosen as the second predictor in
the model.
3. Choosing Subsequent Predictors 

Subsequent predictor determination is similar to the
second predictor determination. Compute $a_0$ with N predictors
(N = 1,...,M+1; M = the number of predictors already in the
developmental model), that is, the first through Mth pre-
dictors, as previously determined, and each of the remaining
predictors. Those predictors which do not increase $a_0$ above
its value as determined with M predictors are removed from
further consideration. RSS FD is computed for each of the
remaining predictors and the one with the lowest RSS FD is
chosen as the Nth predictor in the model.
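
The selection rule used at each stage (discard candidates that do not raise $a_0$, then take the lowest RSS FD among the survivors) can be sketched as follows. This is only an illustrative sketch: the function name `select_next_predictor`, the candidate names and every number are hypothetical, and the $a_0$ and RSS FD values are assumed to have been computed elsewhere.

```python
def select_next_predictor(a0_current, candidates):
    """Pick the next predictor for the developmental model.

    candidates maps a predictor name to a pair
    (a0 obtained when that predictor is added, RSS FD of that predictor).
    Returns (chosen name or None, dict of surviving predictors -> RSS FD).
    """
    # Keep only the predictors that raise a0 above its current value.
    survivors = {
        name: rss_fd
        for name, (a0_new, rss_fd) in candidates.items()
        if a0_new > a0_current
    }
    if not survivors:
        return None, survivors
    # Among the survivors, the lowest RSS FD wins.
    chosen = min(survivors, key=survivors.get)
    return chosen, survivors


# Hypothetical values, for illustration only.
stage = {"P1": (0.71, 0.40), "P2": (0.69, 0.10), "P3": (0.72, 0.25)}
chosen, pool = select_next_predictor(0.70, stage)
```

In this made-up stage, P2 has the lowest RSS FD but is dropped first because it does not raise $a_0$, so P3 is chosen.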
4. Significance Tests

After each stage (i.e., after each new predictor to
be included in the developmental model is determined) it is
necessary to determine if the results are significant. This
is accomplished by Monte-Carlo means using the data set
marginal probabilities of the predictors and assuming equal
probability of occurrence for visibility categories (Appen-
dix A). The statistics $a_0$ and $a_1$ are computed for each of
100 randomly generated data sets of a size equal to the
number of observations in the dependent data set being tested,
and sorted from lowest to highest. The 96th value of $a_0$
($a_0(96)$) and the 5th value of $a_1$ ($a_1(05)$) are retained as
the upper and lower 5% critical values. For developmental
model results to be significantly better than chance, $a_0$
must be greater than or equal to $a_0(96)$ and $a_1$ must be less
than or equal to $a_1(05)$.
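
The critical-value step of this test can be sketched as below. The sketch assumes the 100 Monte-Carlo values of $a_0$ and $a_1$ have already been computed from the randomly generated data sets; the function names and the sample values are hypothetical and stand in for real Monte-Carlo output.

```python
import random


def critical_values(a0_samples, a1_samples):
    """Return (a0(96), a1(05)) from 100 Monte-Carlo values of each statistic."""
    if len(a0_samples) != 100 or len(a1_samples) != 100:
        raise ValueError("this sketch assumes exactly 100 random data sets")
    a0_sorted = sorted(a0_samples)  # lowest to highest
    a1_sorted = sorted(a1_samples)
    # 96th value of a0 and 5th value of a1 (1-based positions).
    return a0_sorted[95], a1_sorted[4]


def is_significant(a0, a1, a0_crit, a1_crit):
    """Model results beat chance when a0 >= a0(96) and a1 <= a1(05)."""
    return a0 >= a0_crit and a1 <= a1_crit


# Hypothetical stand-in for Monte-Carlo output: 0.01, 0.02, ..., 1.00, shuffled.
rng = random.Random(1)
a0_samples = [k / 100 for k in range(1, 101)]
a1_samples = [k / 100 for k in range(1, 101)]
rng.shuffle(a0_samples)
rng.shuffle(a1_samples)
a0_crit, a1_crit = critical_values(a0_samples, a1_samples)
```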

5. Terminating the Selection of Predictors 


Model development continues until any one of four
conditions is met:

a. no more predictors remain to be considered.

b. results are no longer significant. 

c. required computer region size exceeds that which is 
allowed (two megabytes at the NPS W.R. Church Computer 
Center). 

d. independent data ATS1 does not increase for two
consecutive predictor additions. (It will be shown 
that there is a point in the development of the model 
where the skill and threat scores for the dependent 
data diverge sharply from those for the independent 
data. This condition for terminating model development 
is a subjective attempt at taking this point into 
consideration.) 

Once the model development is complete, contingency 
tables of forecast visibility categories vs. observed visi- 
bility categories, for both the dependent and independent 
data, are constructed. From the contingency tables, threat 


and skill scores for both data sets are computed and compared. 


B. COMPARISON METHODOLOGY 

The results obtained from the Preisendorfer (1983 a,b,c) 
methodology were compared to two variations of a linear, 
least-squares regression model. The model chosen for the 
comparison is that available in the BMDP Statistical Software 
(namely BMDP2R) [University of California, 1981] using two 
new threshold schemes developed by Lowe (1984c) (Appendix B). 


The equations developed by BMDP2R include all predictors which
increased R-squared (the proportion of the predictand vari-
ance explained by the estimation of the predictand from the
multiple regression equation) by at least 1%. An excellent 
description of this procedure is given by Best and Pryor 
(1983), with R-squared being equivalent to their R-value. 
1. Method 1 

The first linear regression method consists of 
generating a single equation, trained on the dependent data, 
with the predictand set equal to 1, 2 or 3, corresponding to
visibility categories I, II and III, respectively. This
equation is used to determine threshold values (Appendix B)
and is then applied to the independent data. 

2. Method 2

The second linear regression method is based on a
decision-tree scheme using two linear-regression equations
trained on the dependent data. The first equation is
generated with the predictand values set equal to zero or 
one, corresponding to combined visibility categories I and 
II (0) and visibility category III (1). The second equation 
is generated with the predictand set equal to zero or one, 
corresponding to visibility category I (0) and visibility 
category II (1). Visibility category III observations are 
ignored during this linear regression. Threshold values are 
then computed for each equation. 

When both equations and their associated threshold
values are known, the independent data set is sorted into
visibility category III and visibility category 'other' by
the first equation, and the 'other' category is sorted into
visibility categories I and II by the second equation.
Following the development of linear regression method 1 and 
method 2, contingency tables are constructed, skill and 
threat scores computed, and comparisons made with the results 


from the Preisendorfer (1983 a,b,c) methodology. 
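
The two-equation sorting used by method 2 can be sketched as below. This is a hedged illustration only: the threshold schemes of Appendix B are not reproduced here, so the sketch assumes a single cut-off per equation (`t1`, `t2`) and assumes that an estimate at or above the cut-off maps to the category coded 1. The function name and all numbers are hypothetical.

```python
def classify_method2(y1, y2, t1, t2):
    """Sort one observation into visibility category 1, 2 or 3.

    y1: estimate from the first equation (0 = categories I/II, 1 = III)
    y2: estimate from the second equation (0 = category I, 1 = II)
    t1, t2: assumed single cut-off thresholds for the two equations
    """
    # First equation: category III versus 'other'.
    if y1 >= t1:
        return 3
    # Second equation: split 'other' into categories I and II.
    return 2 if y2 >= t2 else 1


# Hypothetical estimates and thresholds, for illustration only.
examples = [(0.8, 0.1), (0.2, 0.7), (0.2, 0.3)]
forecasts = [classify_method2(y1, y2, 0.5, 0.5) for y1, y2 in examples]
```

The second estimate is never consulted once the first equation has assigned category III, which mirrors the sorting order described above.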
VI. RESULTS


A. NORTH PACIFIC OCEAN
1. First-Predictor Selection and Interval Determination 
The first predictor selected, for equally populous
intervals (EPI's) of four through ten, was EHF (Table III).
The constant value of $a_1$ for a maximum-probability strategy
indicates that there is no predictability for visibility
category II (the least frequent category in the data set)
using a single predictor. A comparison of the dependent
data adjusted $a_0$ (AAO) and independent data adjusted threat
score for visibility category I (ATS1) subjectively deter-
mined the selection of five EPI's for the developmental
model (Table IV; Fig. 5).
2. Selecting Subsequent Predictors 

Once the number of intervals and first predictor
were known, a new $a_0$ computation was made with the first
predictor and each of the remaining predictors. Only six of
the remaining 21 predictors, CLIMO, SEHF, THF, DDWW, H510
and RH, in combination with EHF, gave new $a_0$ values greater
than that for EHF alone (.697); these comprised the pool of
predictors to be considered for further development of the
model. Functional dependence (FD) with EHF was computed for
each of these six predictors and DDWW was chosen as the second
predictor because it had the lowest FD.
For the determination of the third through sixth
predictors, a new $a_0$ was computed as a function of all of
the previously selected predictors and each of the remaining
predictors. At each stage, the new $a_0$ computation for each
remaining predictor was greater than that for the prior
stage, so no further predictors were eliminated from con-
sideration. FD was then computed, for each of the predictors
being considered with each of the predictors previously
selected, and RSS FD determined. At any given stage (three
through six) the new predictor added to the developmental
model was that one with the lowest RSS FD. The third through
sixth predictors, in order of selection, are H510, RH, THF
and CLIMO (Table V).

3. Determining the Final Model 

The final model for the Preisendorfer (1983 a,b,c) 
methodology was determined by comparing the independent data 
contingency table statistics, from each developmental stage, 
and choosing the fourth stage because it gave the highest 
adjusted threat score for visibility category I (ATS1) 
(Fig. 6). The contingency tables for stage four and the 
related statistics for the three strategies are shown in Table
VI.

4. Linear Regression 

A single linear-regression equation was developed
from the North Pacific Ocean data using method 1. Both the
quadratic and equal-variance threshold models (Appendix B)
were applied, but only threshold values from the equal-
variance model were used to compare methodologies. Table
VII contains the linear regression equation, the visibility
category linear regression statistics and the threshold
values. Contingency tables and related statistics for the
dependent and independent data are shown in Table VIII.

5. Discussion

The best results obtained from the North Pacific
Ocean data were from the Preisendorfer (1983 a,b,c) methodology,
MAXPROB2 strategy, as it has the highest independent data
adjusted threat scores for visibility categories I and com-
bined I/II (ATS1 = .20, ATS12 = -.05). Each of the maximum-
probability strategies (MAXPROB1: ATS1 = .17, ATS12 = -.10)
is better than linear regression (ATS1 = .16, ATS12 = -.13),
while natural-regression shows the poorest skill (ATS1 = -.02,
ATS12 = -.19).

It appears, from Fig. 6, that most of the usable
forecastability resides in the first predictor chosen. This
would indicate that it may be profitable to search for
better predictors by combining model output parameters,
conducting dimensional analysis or using linear-regression
equation estimates as predictors, as was done in the North
Atlantic Ocean experiments which follow.


B. NORTH ATLANTIC OCEAN AREA 3W 
Based upon the results obtained in the North Pacific
Ocean, it was decided to use the linear regression model to
generate equations which could be used as predictors. Seven
such equations were developed, each representing a different
menu of parameters available to the regression model. The
seven equations are included in Appendix D. The Preisen-
dorfer (1983 a,b,c) methodology then proceeded both with
and without these linear-regression equations available as
predictors.
1. First Predictor Selection and Interval Determination 
a. Without Linear-Regression Equations as Predictors 
The first predictor, for EPI's of four through
ten, varied with the number of intervals (Table IX). A
comparison of the dependent data AAO and the independent
data ATS1 determined the selection of eight EPI's for the
model (Table X) and, therefore, SMF as the first predictor.
However, through investigator error, the model was initially
developed with five EPI's and E925 as the first predictor.
Therefore, both results will be presented.
b. With Linear-Regression Equations as Predictors 
The first predictor for each EPI of four through
ten is BM1, the predictand estimate computed by the linear
regression equation developed when all of the predictors
were available to the regression model (Table XI). Two of
the EPI's, namely four and eight, have identical, and best,
dependent data AAO and independent data ATS1 scores (Table
XII, Fig. 7), so it was decided to proceed with the develop-
mental model for both intervals.

2. Selecting Subsequent Predictors

Subsequent predictors were chosen in the same way as
described in the procedures and for the North Pacific Ocean
experiment. The predictors, not including linear regression
equations as predictors, are SMF, D850, RH, UBLW and ENTRN
for eight EPI's (Table XIII) and E925, U700, DVDP, STRTIFQ,
ENTRN and PS for five EPI's (Table XIV). The predictors,
including linear regression equations as predictors, are
BM1, US>0, D500, V850, D1000 and U1000 for four intervals
(Table XV) and BM1, U500, ENTRN, DVDP and BM4 for eight
intervals (Table XVI). Significance tests were made after
each predictor selection and $a_0(96)$ and $a_1(05)$ values are
included in Tables XIII, XV and XVI. A comparison of the
behavior of critical level statistics, as predictors are
added, for both four and eight intervals, is shown in Figs.
8 and 9, where array size is equal to the number of EPI's
taken to a power equal to the number of predictors included
at that stage.
3. Determining the Final Model 

The final model for the Preisendorfer (1983 a,b,c)
methodology was determined by comparing the independent data
contingency table statistics, from each developmental stage,
and choosing that stage which gave the highest adjusted
threat score for visibility category I (ATS1).


a. Without Linear Regression Equations as 
Predictors (Eight Intervals) 


It was determined, from Fig. 10, that the fifth
stage gave the best results (MAXPROB1, independent data:
ATS1 = [illegible], ATS2 = -03, ATS12 = -.05). The contingency tables
for stage five and related statistics for the three strategies
are shown in Table XVII.


b. Without Linear Regression Equations as 
Predictors (Five Intervals) 


It was determined, from Fig. 11, that the fifth
stage gave the best results (MAXPROB2, independent data:
ATS1 = .25, ATS2 = .02, ATS12 = .01). The contingency tables
for stage five and related statistics for the three strategies 
are shown in Table XVIII. 


c. With Linear Regression Equations as 
Predictors (Four Intervals) 


It was determined, from Fig. 12, that the fourth 
stage gave the best results (MAXPROB2, independent data: 
ATS1 = .40, ATS2 = -.05, ATS12 = .12). The contingency tables 
for stage four and related statistics for the three strategies 
are shown in Table XIX. 


d. With Linear Regression Equations as 
Predictors (Eight Intervals) 


It was determined, from Fig. 13, that the second 
stage gave the best results (MAXPROB2, independent data: 
ATS1 = .32, ATS2 = -.14, ATS12 = .02). The contingency tables 
for stage two and related statistics for the three strategies 
are shown in Table XX.
4. Linear Regression 

Both linear regression methods (single equation and
decision tree) and both threshold models (quadratic and
equal variance) [Lowe, 1984a] were used to compare with the
Preisendorfer (1983 a,b,c) methodology in the North Atlantic
Ocean Area 3W. Additionally, the predictors available for
regression were varied as indicated in the following descrip-
tion. The first regression was conducted with all available
MOP's while the second regression was conducted using only
the best predictors from the Preisendorfer methodology (de-
fined as those predictors which, alone, produced an $a_0$ value
greater than the frequency of visibility category III in the
dependent data). Table XXI contains the linear-regression
equations, associated visibility category statistics and
threshold values. Tables XXII through XXVII contain the
contingency tables and related statistics for the dependent
and independent data for each of the linear regression
variations.

5. Discussion 

Table XXVIII summarizes each of the methodologies and
strategies applied to the North Atlantic Ocean Area 3W
data. In general, the maximum-probability strategy did
better than the other methods or strategies. Specifically,
the best results overall were obtained by the MAXPROB2
strategy, using predictors computed from linear regression
equations and four equally populous intervals. The methodology
without linear regression equations as predictors, and all
of the linear regression results, are about equivalent. The
best linear regression method is the decision tree, when all
MOP's are made available to the regression model. The results
obtained without linear regression equations as predictors
appear to discount the procedure established for choosing the
number of equally populous predictor intervals, but lend
support to the claim in Chapter V that there is a tendency
for the Preisendorfer (1983 a,b,c) methodology to give better
results using a small number of intervals.
VII. CONCLUSIONS AND RECOMMENDATIONS


The primary objective of this study was to determine 
if the Preisendorfer (1983 a,b,c) methodology applied to the 
FNOC NOGAPS model output parameters could improve upon the 
forecasting of atmospheric marine horizontal visibility, in 
three categories, when compared to the more traditional 
method of least squares, multiple linear regression. It was 
shown that, indeed, the proposed methodology, namely, the 
maximum probability strategy, was superior when predictand 
estimates, computed from linear regression equations 
themselves, were used as predictors. 

The method of determining the number of equally populous 
predictor intervals requires further investigation. The 
results from the North Atlantic Ocean area 3W, without 
linear regression equations as predictors, showed that the 
proposed method was not the best, in that the number of inter- 
vals determined by the method was eight but better results 
were obtained with five. Additionally, only intervals of 
ten or less were considered here, due to storage limitations 
imposed by the computer center. As a result, the optimum 
number of predictor intervals is inconclusive. 

Predictor determination appears to be adequate. At each
stage of development a unique predictor was selected. The
only foreseeable problem is if, during the first (initial)
stage of development, multiple predictors have identical CE
and PP values, or, during subsequent stages, multiple pre-
dictors have identical $a_0$ and FD values. Should this occur,
the model development would have to proceed, from that
particular stage, with each of the identified predictors.

The methodology appears to be sensitive, in two ways, to
the first predictor selected. First, there is an initial
large value for the independent data ATS1 and small incre-
mental increases thereafter for each new predictor added.
Secondly, there is a large magnitude difference in the
initial independent data ATS1 values between the Preisen-
dorfer methodology without linear regression equations as
predictors (ATS1 = .13; .14) and that with linear regression
equations as predictors (ATS1 = .30), for the maximum
probability strategy.

The best strategy is MAXPROB2, followed by MAXPROB1, and
then natural-regression. Generally, natural-regression does 
worse than linear regression. None of the methods did well 
in predicting visibility category II, which may indicate 
that visibility would be best handled as a two-category 
phenomenon. 

The number of independent data observations (1526) in
North Atlantic Ocean Area 3W was sufficient to test the
methodology. This was demonstrated by the similar results
between Area 3W, without linear regression equations as
predictors, and the North Pacific Ocean results (3682
observations). The small differences in the contingency
table statistics for the independent data for the two experi-
ments can be attributed to parameters being from different
models and for different months.


The following recommendations are offered for future
research and to future researchers:

1. Investigate the problem of determining the optimum
number of equally populous predictor intervals.
Possibly, a statistic similar to the threat scores
or adjusted threat scores could be used, or, simply
choose the interval, between two and ten, which gives
the highest adjusted threat scores for the independent
data. Alternatively, adopt, without further experimen-
tation, the number of EPI's as five, which appears to
be a compromise between a gross resolution of the
predictor parameter range and a fine (but too expensive)
resolution of the predictor parameter range.

2. Investigate the use of potential predictability (PP)
in determining the selection of predictors. During
the initial stage of development, PP is computed for
all available predictors and provides a measure of
each predictor's individual ability to forecast
visibility, but it is not used explicitly. Perhaps
compute the mean and standard deviation of PP
during the initial stage and remove from considera-
tion those predictors whose PP is not greater than a
value equal to the mean minus one standard deviation
or, simply, not greater than the mean. This would
ensure that only those predictors which have a rela-
tively high prospect of forecasting visibility will
be available for subsequent selection.

3. Search for better predictors which are particularly
suited to visibility prediction. Recommended sources
are: new, direct and derived, model output parameters
(including original model output); non-dimensional
parameters derived from dimensional analysis; and
boundary-layer parameters such as the optical structure
function and extinction coefficients.

4. Investigate a two-category visibility scheme.

5. Install automatic visibility recorders on ocean-going
military and civilian passenger/cargo ships. This
will place visibility observations on a more objective
basis and lead to improved methods of forecasting
visibility, as well as verifying such forecasts.

6. Investigate new prediction models, preferably those
which attempt to manipulate the observed data to
correct for probable observer bias (following Selsor,
1980; Renard and Thompson, 1984). This would be
unnecessary if recommendation 5 was acted upon.

7. Investigate other ocean areas and seasons to determine
if the physically homogeneous area scheme is consistent
and viable. Develop prediction tables and other aids
specifically tailored to region and season.

8. Use a statistic other than ATS1 for choosing the
[illegible] and for comparing methods and strate-
gies. It was used in this study largely because of
its greater magnitude, as compared to ATS2 and ATS12.
This was due to the relatively high frequency of visi-
bility category I in both data sets. In general, this
will not be the case. Because three visibility cate-
gories are being considered, and good forecasts of
the two poorest visibility categories are desirable, a
statistic such as ATS12 would be better suited as a
consistent comparison statistic for future researchers.

9. As soon as it is feasible, eliminate from further
testing the MAXPROB1 strategy in order to allow for
more efficient and faster program execution. The
natural-regression strategy, though it gave the poorest
results in this study, should be re-examined when
predictands with relatively many discrete states
(e.g., ceiling) are considered. It has, in such
settings, potential to outperform the more rigid
linear regression technique.
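
The screening idea in recommendation 2 can be sketched as follows. This is an illustration under stated assumptions, not a tested procedure: the function name `screen_by_pp`, the predictor names and the PP values are hypothetical, and the population standard deviation is used because the text does not say which form is intended.

```python
from statistics import mean, pstdev


def screen_by_pp(pp_by_predictor, k=1.0):
    """Keep predictors whose PP exceeds (mean - k * standard deviation).

    k = 1.0 gives the 'mean minus one standard deviation' cut-off;
    k = 0.0 gives the simpler 'greater than the mean' cut-off.
    """
    values = list(pp_by_predictor.values())
    cutoff = mean(values) - k * pstdev(values)
    return {name for name, pp in pp_by_predictor.items() if pp > cutoff}


# Hypothetical first-stage PP values, for illustration only.
pp_values = {"A": 0.10, "B": 0.24, "C": 0.32, "D": 0.33}
kept_loose = screen_by_pp(pp_values, k=1.0)
kept_strict = screen_by_pp(pp_values, k=0.0)
```

With these made-up values the looser cut-off removes only predictor A, while the stricter cut-off also removes B.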
APPENDIX A

A DISCUSSION OF THE STATISTICAL PROCEDURES PROPOSED BY
PREISENDORFER (1983 a,b,c) FOR THE FORECASTING OF
ATMOSPHERIC MARINE HORIZONTAL VISIBILITY USING
MODEL OUTPUT STATISTICS


I. INTRODUCTION

The following discussion is based upon three unpublished
research papers by Preisendorfer (1983 a,b,c). His proposed
methodology deals with a simple statistical manipulation of
model output parameters (predictors) which have been trans-
formed from continuous to discrete quantities by grouping
each predictor into equally populous intervals. The proce-
dural approach in applying his methodology to model output
statistics (MOS) forecasting is as follows:

1. Generate predictand/predictor pairs of data using the
United States Navy Fleet Numerical Oceanography Center
Navy Operational Global Atmospheric Prediction System
(NOGAPS) model output (predictors) and synoptic ship
visibility observations (predictand) provided by the
Naval Oceanography Command Detachment, Asheville, NC,
and generate bivariate plots.

2. Generate conditional probability tables based on the
distribution of the predictand/predictor pairs.

3. Define prediction strategies based on the conditional
probabilities.

4. Compute the potential predictability of visibility
from the conditional probability tables.

5. Compute skill scores of the prediction strategies and
choose the first predictor.

6. Repeat steps 1, 2, 4, and 5, for multiple predictors.

7. Compute functional dependence of selected vs. potential
subsequent predictors.

8. Choose the next predictor.

9. Repeat steps 1, 2, 4, 5, 7, and 8, until model
development is terminated.

For demonstration purposes, an artificial data set of
99 cases, consisting of four predictors plus visibility
(predictand), will be used throughout this discussion.
Each predictor parameter is divided into three equally popu-
lous intervals and visibility is divided into three categories,
as illustrated in Table A1. The four predictors are
Evaporative Heat Flux (EHF), Fog Probability Parameter
(FTER), Relative Humidity (RH) and Air-Sea Temperature
Difference (ASTD). Visibility categories are defined by the
marine visibility observation codes (MVOC) included in the
categories.
TABLE A1

ARTIFICIAL DATA SET

Interval 1        Interval 2                Interval 3

EHF < 2.65        2.65 <= EHF <= 4.44       EHF > 4.44
FTER < .024       .024 <= FTER <= 9.9       FTER > 9.9
RH < 55.9         55.9 <= RH <= 90.0        RH > 90.0
ASTD < 1.02       1.02 <= ASTD <= 1.91      ASTD > 1.91

Visibility Category I:   MVOC 90 -> 94 (60 cases)
Visibility Category II:  MVOC 95 & 96  (20 cases)
Visibility Category III: MVOC 97 -> 99 (19 cases)


II. SINGLE-PREDICTOR STATISTICS


A. BIVARIATE PAIRS 

Choose various visibility-predictor pairs and make
bivariate plots of these pairs. This will provide immediate
visual estimation of the potential predictability. As an
example, let us suppose that predictor EHF of our artificial
data set has 33 cases in each equally populous interval and
that the visibility categories I, II and III are respectively
represented by 17, 7 and 9 in interval 1; 1, 7 and 25 in
interval 2; 1, 6 and 26 in interval 3. To make the bivariate
plot, simply make a tabular summary of this information, as
illustrated in Fig. 14. Now we define, from the bivariate
plot, our coordinate system and nomenclature. Items in
parentheses are examples from Fig. 14; numbers in brackets
are equation numbers from Preisendorfer (1983 a,b,c) with
a letter designator indicating the paper from which it was
obtained.
n = number of visibility categories (n = 3)
m = number of equally populous predictor intervals
(m = 3)
j = the vertical counting index (j = 1,...,n)
i = the horizontal counting index (i = 1,...,m)
n(i,j) = individual cell counts (n(1,3) = 9)
n(.,j) = marginal predictand totals = $\sum_{i=1}^{m} n(i,j)$ =
row totals (n(.,2) = 20) [3.1a]
n(i,.) = marginal predictor totals = $\sum_{j=1}^{n} n(i,j)$ =
column totals (n(2,.) = 33) [3.2a]
n(.,.) = total predictand/predictor pairs =
$\sum_{j=1}^{n} \sum_{i=1}^{m} n(i,j)$ = sum over all cells (n(.,.) = 99)
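
The cell counts and marginal totals defined above can be checked with a short sketch. It uses the EHF counts quoted in the text for Fig. 14; the variable names are illustrative, and the list is indexed as `counts[i][j]` with 0-based indices, so n(1,3) corresponds to `counts[0][2]`.

```python
# counts[i][j]: predictor interval i (column), visibility category j (row),
# taken from the EHF example: 17, 7, 9 / 1, 7, 25 / 1, 6, 26.
counts = [
    [17, 7, 9],   # interval 1: categories I, II, III
    [1, 7, 25],   # interval 2
    [1, 6, 26],   # interval 3
]
m = len(counts)       # number of predictor intervals
n = len(counts[0])    # number of visibility categories

# n(i,.): column totals, one per predictor interval.
col_totals = [sum(counts[i]) for i in range(m)]
# n(.,j): row totals, one per visibility category.
row_totals = [sum(counts[i][j] for i in range(m)) for j in range(n)]
# n(.,.): sum over all cells.
total = sum(col_totals)
```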


B. CONDITIONAL PROBABILITIES 
From the bivariate pairs determine the conditional proba-
bility of visibility given a predictor. We will continue from
the bivariate plot in Fig. 14, and define three probabilities:

$p_{12}(i,j)$ = n(i,j)/n(.,.) = joint probability of a
predictand-predictor pair occurring in a
given cell = individual cell count
divided by the total number of cases
($p_{12}(3,3)$ = 26/99 = .2626)

$p_1(i)$ = n(i,.)/n(.,.) = marginal probability of
predictor = column total divided by the
total number of cases = the column sum of
the joint probabilities
($p_1(2)$ = 33/99 = .333)

$p_2(j)$ = n(.,j)/n(.,.) = marginal probability of
predictand = row total divided by the
total number of cases = the row sum of the
joint probabilities ($p_2(2)$ = 20/99 = .202)
[3.7a]


We can now build a joint/marginal probability table as
illustrated in Fig. 15, and define conditional probability.

$p_{2|1}(j|i)$ = $p_{12}(i,j)/p_1(i)$ = n(i,j)/n(i,.) =
conditional probability of predictand given
a predictor = a cell's joint probability
divided by the marginal probability of
predictor = individual cell count divided
by column total
($p_{2|1}(2|2)$ = (.0707)/(.333) = 7/33 = .212)
[3.8a]
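
The joint, marginal and conditional probabilities can be computed from the same counts as in the sketch below. The variable names (`joint`, `p1`, `p2`, `cond`) are illustrative stand-ins for the joint, predictor-marginal, predictand-marginal and conditional probabilities defined above; indices in the code are 0-based.

```python
# EHF example counts, counts[i][j] = n(i,j) with 0-based indices.
counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
m, n = len(counts), len(counts[0])
total = sum(sum(col) for col in counts)

# Joint probability of each cell: n(i,j) / n(.,.).
joint = [[counts[i][j] / total for j in range(n)] for i in range(m)]
# Marginal probability of the predictor: column sum of the joint table.
p1 = [sum(joint[i]) for i in range(m)]
# Marginal probability of the predictand: row sum of the joint table.
p2 = [sum(joint[i][j] for i in range(m)) for j in range(n)]
# Conditional probability of category j given interval i.
cond = [[joint[i][j] / p1[i] for j in range(n)] for i in range(m)]
```

Each column of `cond` sums to one, which is a quick consistency check on a table such as Fig. 16.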


Now build a conditional probability table as illustrated
in Fig. 16. Conditional probability of visibility, given
some predictor, is the quantity of greatest interest in this
study. Note that if $p_{2|1}(j|i)$ = 1/n for all j at
some i (i.e., each cell contains 1/n of the cases in its
column), then very little information is available to predict
visibility at that i. However, if $p_{2|1}(j_0|i)$ = 1 for some
$j_0$ and $p_{2|1}(j|i)$ = 0 for all other j values, then there is
perfect predictability of class $j_0$ by the predictor at class
i. The underlying methodology of this study will be to
determine the maximum conditional probability of visibility
for each predictor value.
C. STRATEGIES
Preisendorfer (1983 a,b,c) presents three different
prediction strategies, two based on maximum probabilities
(MAXPROB1 and MAXPROB2) and one based on natural regression.
1. Maximum Probability

This strategy consists of determining the cell, in a
given column, with the highest conditional probability, and
assigning to the column the visibility category associated with
that cell. As each column represents an interval of predic-
tor values, we now have a visibility forecast value associated
with that interval. In our example with EHF (Fig. 16),
interval 1 (i = 1) will have a forecast value of visibility
category I (VISCAT 1). Hence, if we used only EHF as a
predictor, every time a value of EHF was encountered with a
value < 2.65, we would predict visibility category I. Simi-
larly, for interval 2 (i = 2) and for interval 3 (i = 3)
we would choose visibility category III (VISCAT 3).

MAXPROB1 and MAXPROB2 differ only in the way they
handle a tie between maximal conditional probabilities in
a column. Should this occur, then a decision must be made
as to which predictand category will be assigned to that
predictor interval. In MAXPROB1, this decision is made by
a coin toss, figuratively. A random number, in the unit
interval, is generated. The unit interval is divided into a
number of subintervals equal to the number of tied values
and each subinterval is assigned to a specific predictand
category. The subinterval into which the random number
falls determines the forecast visibility category. In
MAXPROB2, the lowest predictand category, among the tied
categories, is chosen.
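
The two tie-handling rules can be sketched as below. This is a minimal illustration, assuming the unit interval is split into equal subintervals for MAXPROB1 (the text does not state the subinterval widths); the function name `maxprob_forecast` is hypothetical, and exact fractions are used so that ties are detected without floating-point noise.

```python
import random
from fractions import Fraction


def maxprob_forecast(cond_col, strategy="MAXPROB2", rng=random):
    """Return the forecast category (1-based) for one predictor interval.

    cond_col: conditional probabilities of categories I, II, III, ...
    strategy: "MAXPROB1" (random choice among ties) or "MAXPROB2"
              (lowest tied category).
    """
    best = max(cond_col)
    tied = [j for j, p in enumerate(cond_col, start=1) if p == best]
    if strategy == "MAXPROB2" or len(tied) == 1:
        return tied[0]  # lowest category among the tied ones
    # MAXPROB1: a random number in the unit interval picks one of
    # len(tied) equal subintervals (assumed equal widths).
    u = rng.random()
    return tied[min(int(u * len(tied)), len(tied) - 1)]


# EHF example: conditional probabilities built from the Fig. 14 counts.
counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
cond = [[Fraction(c, sum(col)) for c in col] for col in counts]
forecasts = [maxprob_forecast(col) for col in cond]

# A made-up column with a tie between categories I and II.
tie_col = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
tie_maxprob2 = maxprob_forecast(tie_col, "MAXPROB2")
tie_maxprob1 = maxprob_forecast(tie_col, "MAXPROB1", random.Random(0))
```

For the EHF example no ties occur, so both strategies give the same forecasts as the text: category I for interval 1 and category III for intervals 2 and 3.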

2. Natural Regression

This strategy consists of first finding the average
predictand (visibility category) for each predictor interval,
using conditional probabilities, and then choosing the
predictand category nearest the average.

$$\bar{j}(i) = \sum_{j=1}^{n} j \, p_{2|1}(j|i)$$

Fig. 17 shows the computation for EHF interval 1 (i = 1).
Visibility category II (VISCAT 2) would be assigned to this
interval by this strategy.
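
The natural-regression assignment can be sketched as follows, using the EHF counts quoted for Fig. 14. The function name is hypothetical, and breaking an exact tie between two equally near categories toward the lower category is an assumption of this sketch, since the text does not address that case.

```python
def natural_regression(cond_col):
    """Return (average category, nearest category) for one interval.

    cond_col holds the conditional probabilities of categories 1..n.
    """
    # Probability-weighted average of the category numbers.
    average = sum(j * p for j, p in enumerate(cond_col, start=1))
    # Category nearest the average; min() keeps the lower one on a tie.
    nearest = min(range(1, len(cond_col) + 1), key=lambda j: abs(j - average))
    return average, nearest


counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
cond = [[c / sum(col) for c in col] for col in counts]
results = [natural_regression(col) for col in cond]
```

For interval 1 the average is 58/33, about 1.76, so category II is assigned, in agreement with the text.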


D. COMPARISON STATISTICS

To determine if a predictor will be useful in forecasting,
there should be a statistic with which to compare its poten-
tial utility. Preisendorfer (1983 a,b,c) defines four such
statistics and their critical values. The four statistics
defined are potential predictability (PP), class-error
probabilities ($a_0$, $a_1$) and functional dependence (FD).
Potential predictability and class-error probabilities will
be defined now. Functional dependence will be addressed
later.


1. Potential Predictability

Potential predictability of a predictand/predictor
pair is defined as:

$$PP(2|1) = \frac{n}{n-1} \sum_{i=1}^{m} p_1(i) \sum_{j=1}^{n} \left( p_{21}(j|i) - \frac{1}{n} \right)^2 = \sum_{i=1}^{m} p_1(i) \, PP(2|i) \qquad [A.1a]$$

where:

$PP(2|i) = \frac{n}{n-1} \sum_{j=1}^{n} \left( p_{21}(j|i) - 1/n \right)^2$,
$p_1(i)$ = the marginal probability of a predictor, and
$p_{21}(j|i)$ = the conditional probability of the jth
predictand, given the ith predictor.

PP(2|1) is loosely related to Shannon's definition of
information [Preisendorfer, 1983a]. An example calculation is
shown in Fig. 18 where EHF has a PP value of .330. To
determine if this would be the best predictor using this
statistic, compute the potential predictability for all
predictors and rank them from highest to lowest. The
predictor with the highest PP should be the best predictor
for forecasting visibility using any strategy.
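The potential-predictability calculation can be sketched as follows, assuming the definition PP(2|1) = n/(n-1) times the sum over columns of p1(i) times the sum over categories of (p21(j|i) - 1/n) squared. The function name is hypothetical. The sample table is an assumption: it uses the conditional probabilities quoted for EHF in the class-error example of this appendix, with the unquoted entries filled in as remainders so that each column sums to 1.

```python
def potential_predictability(marginals, cond_probs):
    """PP(2|1) for one predictor.

    marginals[i]     = p1(i), marginal probability of predictor interval i
    cond_probs[i][j] = p21(j|i), conditional probability of category j
    """
    n = len(cond_probs[0])
    pp = 0.0
    for p_i, column in zip(marginals, cond_probs):
        # PP(2|i): scaled squared departure of the column from uniform.
        pp_i = n / (n - 1) * sum((p - 1.0 / n) ** 2 for p in column)
        pp += p_i * pp_i
    return pp


# Assumed EHF-like table (three equally populous intervals).
ehf_like = [[0.515, 0.212, 0.273],
            [0.758, 0.212, 0.030],
            [0.030, 0.182, 0.788]]
pp_ehf = potential_predictability([1 / 3] * 3, ehf_like)
```

Under these assumed probabilities the result rounds to .330, the PP value quoted for EHF. A table of uniform columns gives 0 and a table with all mass in one cell per column gives 1, which are the ends of the scale.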
2. Class-Error Probabilities

Zero-class ($a_0$) and one-class ($a_1$) error probabilities
can be defined to gauge the predictive skill of a
prediction strategy.

$$a_0 = \sum_{i=1}^{m} p_1(i) \, p_{21}(j_0(i)|i)$$

where:

$p_1(i)$ = the marginal probability of the predictor,
$j_0(i)$ = the $j_0$th cell in column i assigned by
the prediction strategy, and
$p_{21}(j_0(i)|i)$ = the conditional probability of the $j_0(i)$ cell.

From Figs. 15 and 16, $p_1(i)$ = .333 for all i; $j_0(1)$ = 1,
$p_{21}(j_0(1)|1)$ = .515; $p_{21}(j_0(2)|2)$ = .758; and
$p_{21}(j_0(3)|3)$ = .788. Therefore, if EHF is the only
predictor,

$$a_0 = (.333)(.515) + (.333)(.758) + (.333)(.788) = .686$$

The statistic $a_0$ is, by definition, equal to the fraction of
correct forecasts in the dependent data set.

$$a_1 = \sum_{i=1}^{m} p_1(i) \left[ p_{21}(j_0(i)+1|i) + p_{21}(j_0(i)-1|i) \right]$$

where:

$p_{21}(j_0(i)+1|i)$, $p_{21}(j_0(i)-1|i)$ = the conditional probabilities
adjacent to the $p_{21}(j_0(i)|i)$
values used in the $a_0$
determination.

If $j_0(i)$ = 1 then, by definition, $p_{21}(j_0(i)-1|i)$ = 0. Similarly,
if $j_0(i)$ = n then, by definition, $p_{21}(j_0(i)+1|i)$ = 0.
The statistic $a_1$ is, by definition, equal to the fraction of
forecasts for which a class 1 error has been committed.
Again, from Figs. 15 and 16:

$$a_1 = (.333)(.212+0) + (.333)(.212+0) + (.333)(.182+0) = .202$$
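The two class-error statistics can be sketched in Python as below, taking a0 as the marginal-weighted sum of the conditional probabilities in the assigned cells and a1 as the marginal-weighted sum of the probabilities in the cells immediately above and below each assigned cell. The function name is hypothetical. The sample table is an assumption: its quoted entries (.515, .758, .788, .212, .212, .182) come from the worked example, but the remaining entries and the row placement in columns 2 and 3 are filled in here for illustration only.

```python
def class_errors(marginals, cond_probs, assigned):
    """Return (a0, a1) for one predictor.

    marginals[i]     = p1(i)
    cond_probs[i][j] = p21(j+1|i), j counted from 0
    assigned[i]      = j0(i), the 1-based category assigned to column i
    """
    a0 = 0.0
    a1 = 0.0
    for p_i, column, j0 in zip(marginals, cond_probs, assigned):
        idx = j0 - 1
        # Neighbours outside the table contribute zero by definition.
        above = column[idx + 1] if idx + 1 < len(column) else 0.0
        below = column[idx - 1] if idx - 1 >= 0 else 0.0
        a0 += p_i * column[idx]
        a1 += p_i * (above + below)
    return a0, a1


# Assumed EHF-like table and assignments (illustrative only).
table = [[0.515, 0.212, 0.273],
         [0.758, 0.212, 0.030],
         [0.030, 0.182, 0.788]]
a0, a1 = class_errors([0.333] * 3, table, [1, 1, 3])
```

With these assumed inputs the sketch reproduces the quoted values a0 = .686 and a1 = .202.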


To determine which one of two or more predictors is
the most skillful, we can plot the ($a_0$, $a_1$) pairs on a skill
diagram as in Fig. 19. The dashed lines are lines of constant
class error (CE = $a_1 + 2a_2$, where $a_2 = 1 - a_0 - a_1$ is the
two-class error probability) and the more skillful
predictors will lie on the lower right part of the triangle.
In general, the skill on the diagram decreases according to
the zig-zag rule shown in the figure. If, for all predictors,
$a_1$ is constant, which may occur during the first
predictor determination with a data set containing relatively
few poor visibility cases, then the best predictor is that
one with the greatest $a_0$ value. In this instance there is
no need to plot the pairs.
Me On Coen OLClOR STATISTICS


Once all predictand/predictor pairs have been formed
and potential predictability and skill scores determined,
the predictors can be ordered by decreasing predictor skill
and by potential predictability. Fig. 20 contains the
bivariate plot, conditional probabilities, potential
predictability and skill scores for the remaining three
predictors in our artificial data set. The ordering of predictors
is shown in Table A2. Therefore, EHF would be chosen as
our first predictor, as illustrated on the skill diagram
in Fig. 19. As RH, FTER and ASTD have equal $a_0$ and $a_1$
values, they are ranked according to decreasing potential
predictability.


TABLE A2

RANKING OF PREDICTORS BY SKILL
AND POTENTIAL PREDICTABILITY

Rank   Predictor   a0     a1     PP
1st    EHF         .686   .202   .330
2nd    RH          .606   .202   .22?
3rd    FTER        .606   .202   .2??
4th    ASTD        .606   .202   .209


Preisendorfer (1983b) develops statistics, similar to
those already mentioned, for multiple predictors. The main
conceptual difficulty of additional predictors is the
increase of dimensions. One predictor presents a relatively
simple two-dimensional problem (predictor 1 vs. predictand);
two predictors present a three-dimensional problem (predictor 1
vs. predictor 2 vs. predictand); three or more predictors
present four-dimensional and larger problems. However, with
a little manipulation, all of the problems of more than two
dimensions can be reduced to a two-dimensional
problem. This is illustrated in Figs. 21 and 22 for three
dimensions (two predictors) and four dimensions (three
predictors). An easily programmable equation can be developed to
create these two-dimensional arrays based upon the number of
equally populous intervals for each predictor and upon the
interval in which a particular data case resides.

In our continuing example, reduce the equally populous
intervals for each predictor to an integer number (i = 1,...,m)
with 1 corresponding to the lowest interval and m corresponding
to the highest interval, as defined for the predictor
index in Section II.A. Let

ii   = the interval integer number for EHF,
jj   = the interval integer number for RH,
kk   = the interval integer number for FTER,
mm   = the interval integer number for ASTD,
ll   = the column location in the two-dimensional
       bivariate plot (equivalent to i for a
       single predictor),
IGP1 = the total number of intervals for EHF,
IGP2 = the total number of intervals for RH,
IGP3 = the total number of intervals for FTER,
IGP4 = the total number of intervals for ASTD.

Then, for one predictor, EHF:

    ll = ii

for two predictors, EHF and RH:

    ll = IGP2(ii-1) + jj

for three predictors, EHF, RH and FTER:

    ll = IGP2(ii-1+IGP1(kk-1)) + jj

for four predictors, EHF, RH, FTER and ASTD:

    ll = IGP2(ii-1+IGP1(kk-1+IGP3(mm-1))) + jj

This equation form can be expanded to accommodate any number
of predictors.
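The column-location equations can be written as one small Python function. This is a sketch: the function name and keyword arguments are hypothetical, and the single-predictor case is taken to be ll = ii, following the remark that ll is equivalent to i for a single predictor. Note that IGP4 does not appear in the four-predictor equation; it only bounds the range of mm.

```python
def column_location(ii, jj=None, kk=None, mm=None, igp1=1, igp2=1, igp3=1):
    """Column ll of the two-dimensional array for one to four predictors.

    ii, jj, kk, mm are 1-based interval numbers (EHF, RH, FTER, ASTD in
    the running example); igp1..igp3 are the interval counts of the
    first three predictors.
    """
    if jj is None:                      # one predictor
        return ii
    inner = ii - 1
    if kk is not None:                  # three or four predictors
        tail = kk - 1
        if mm is not None:              # four predictors
            tail += igp3 * (mm - 1)
        inner += igp1 * tail
    return igp2 * inner + jj


# Every combination of three predictors with three intervals each
# should map to a distinct column between 1 and 27.
columns = {column_location(i, j, k, igp1=3, igp2=3)
           for i in range(1, 4) for j in range(1, 4) for k in range(1, 4)}
```

The check on `columns` illustrates why the equation works: it is a mixed-radix index, so each combination of intervals lands in exactly one column.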


IV. FUNCTIONAL DEPENDENCE

After the first predictor has been selected, either from
its skill score or potential predictability, we need a means
to determine whether or not to add a new predictor to the
one(s) already chosen. For this purpose, Preisendorfer
(1983c) proposes a functional dependence index (FD) which
describes the dependence of the new predictor being considered
upon those already in the set of predictors. If FD is large
(on the scale 0 to 1) then the new predictor can be represented
by predictors already chosen and its inclusion into the set of
predictors would be redundant. However, if FD is small (on
the scale 0 to 1) then the new predictor is likely to be a
useful addition to the existing collection of predictors.


m n 
FD(2|1) = m/2(m-1) ) ) Bae) aland) = (a) Oral) 
j=1 j=1 


where:

$$q(i,j) = \sum_{k=1}^{n-j} p_{21}(j+k|i+1) + \sum_{k=1}^{j-1} p_{21}(j-k|i-1)$$

= the sum of the conditional probabilities
which lie in column i+1 and rows greater
than j and the conditional probabilities
which lie in column i-1 and rows less than j

= the sum of the conditional probabilities to
the right and up, and to the left and down.
The upper left (1,n) and lower right (m,1)
cells will always have q values equal to zero.

$$r(i,j) = \sum_{k=1}^{j-1} p_{21}(j-k|i+1) + \sum_{k=1}^{n-j} p_{21}(j+k|i-1)$$

= the sum of the conditional probabilities
which lie in column i+1 and rows less than j
and the conditional probabilities which lie
in column i-1 and rows greater than j

= the sum of the conditional probabilities
to the right and down, and to the left and up.
The upper right (m,n) and lower left (1,1)
cells will always have r values equal to zero.

$p_{12}(i,j)$ and $p_{21}(j|i)$ = the joint and conditional
probabilities defined earlier, differing
only in that the abscissa and ordinate are
now predictor vs. predictor rather than predictor
vs. visibility.

Fig. 23 illustrates the FD computation for RH given EHF.
In this example, FD(2|1) = FD(RH|EHF) = .286.


V. CRITICAL VALUES 


Once the various statistics have been found, a means to 
determine whether they are significant must be established. 
Preisendorfer (1983 a,b,c) proposes the use of Monte Carlo 
means, applied as follows. 

From the bivariate plot, as in Figs. 14, 21b and 22b,
we determine the marginal probabilities of the predictor
($p_1(i)$) and establish incremental values from 0 to 1 (note
that for equally populous predictor intervals, $p_1(i)$ = 1/m,
a constant, where m = the number of intervals). We then cast
a total of n(.,.) randomly generated numbers into the
intervals to simulate a new data set. After each randomly
generated data case is cast into a column, it is placed into
a cell using uniform probability. Fig. 24 shows the
incremental values associated with the bivariate plot in Fig. 21b.
In our continuing example we have n(.,.) = 99, so we would
generate 99 random numbers in the unit interval. All random
numbers < .071 would be placed in column i = 1; those greater
than .071 and < .192 would be placed in column i = 2; and
so on. As each data case is placed into a column, a single
random number is generated to determine into which cell the
case is to be placed (e.g., a random number < .33 would be
counted in cell (i,1); a random number greater than .33 and
< .67 would be counted in cell (i,2); etc.). After all 99
cases have been cast into their appropriate cells, all of
the statistics previously discussed would be computed and
saved. This process would be repeated 100 times so that we
would have an array containing 100 randomly generated
potential predictabilities, $a_0$'s, $a_1$'s and FD's. These would be
sorted from lowest to highest and the 96th (PP(96), $a_0$(96),
$a_1$(96) and FD(96)) value would determine the upper 5% critical
value and the 5th (PP(05), $a_0$(05), $a_1$(05) and FD(05)) value
would determine the lower 5% critical value. For all
statistics other than FD, we want values from our dependent data
set to be greater than the upper 5% or less than the lower
5% critical values. For FD we want values lower than the
upper 5% critical value to ensure that our second, and
subsequent, predictor is not significantly dependent on the previous
predictor(s).
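The Monte Carlo procedure can be sketched in Python as follows. All names are hypothetical, and the statistic passed in (`fraction_in_modal_cells`, the fraction of cases falling in the most populous cell of each column) is only a stand-in chosen to keep the example short; in practice PP, a0, a1 or FD computed from the simulated table would be used. The seed and the default of 100 trials are assumptions made so that the sketch is repeatable.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def simulate_counts(marginals, n_categories, n_cases, rng):
    """Cast n_cases random cases into columns (by marginal probability)
    and then into cells (uniformly over the categories)."""
    bounds = list(accumulate(marginals))        # incremental values 0..1
    counts = [[0] * n_categories for _ in marginals]
    for _ in range(n_cases):
        i = min(bisect_right(bounds, rng.random()), len(marginals) - 1)
        j = min(int(rng.random() * n_categories), n_categories - 1)
        counts[i][j] += 1
    return counts


def fraction_in_modal_cells(counts):
    """Stand-in statistic: share of cases in each column's largest cell."""
    total = sum(sum(column) for column in counts)
    return sum(max(column) for column in counts) / total


def critical_values(statistic, marginals, n_categories, n_cases,
                    trials=100, seed=0):
    """Return the (5th, 96th) sorted values out of `trials` simulations."""
    rng = random.Random(seed)
    values = sorted(
        statistic(simulate_counts(marginals, n_categories, n_cases, rng))
        for _ in range(trials))
    lower = values[(5 * trials) // 100 - 1]
    upper = values[(96 * trials) // 100 - 1]
    return lower, upper


lower, upper = critical_values(fraction_in_modal_cells, [1 / 3] * 3, 3, 99)
```

A statistic from the dependent data would then be compared against `lower` and `upper` in the way the paragraph above describes.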


VI. CHOOSING PREDICTORS

The first predictor is determined as shown in Section III,
that is, by computing initial PP, $a_0$ and $a_1$ values for each
predictor, ordering them by skill score and PP and choosing
the one with the greatest skill score, or greatest PP in the
event that all skill scores are identical.

Subsequent predictors will be subjected to two tests:
functional dependence and skill score. Let

p = the number of predictors already chosen,

$a_0(k-1)$ and $a_1(k-1)$ = the 0- and 1-class errors
of the previous stage of construction of the
developmental model,

k = the index of the current stage.

Then, for the next (kth) predictor to be accepted it should
meet the following three conditions:

(1) FD < FD(96|i)  (i = 1, ..., p)
(2) $a_0(k) > a_0(k-1)$ and $a_1(k) < a_1(k-1)$
(3) $a_0(k) > a_0(96)$ and $a_1(k) < a_1(05)$

If condition (1) is not met but conditions (2) and (3) are,
then a predictor may still be used, but the increase of
predictability of the predictand will, on average, be less
than if condition (1) had been met. However, if conditions
(2) and (3) are not met, then the predictor should not be
considered further. Repeat this process at all stages for
all remaining predictors until no further predictors are
available, then stop the construction of the developmental
model.
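The acceptance logic can be sketched as a small Python function. The function name, argument names and the three return labels are hypothetical; in particular, representing condition (1) as one FD value and one FD(96) value per predictor already chosen is an assumption about how the test is applied.

```python
def assess_candidate(fd, fd96, a0_k, a1_k, a0_prev, a1_prev, a0_96, a1_05):
    """Classify a candidate kth predictor.

    fd, fd96 : sequences with one entry per predictor already chosen
               (candidate's FD on that predictor, and its FD(96)).
    Returns "accept", "accept_reduced_gain" or "reject".
    """
    cond1 = all(f < c for f, c in zip(fd, fd96))
    cond2 = a0_k > a0_prev and a1_k < a1_prev
    cond3 = a0_k > a0_96 and a1_k < a1_05
    if cond2 and cond3:
        # Conditions (2) and (3) met; (1) only affects the expected gain.
        return "accept" if cond1 else "accept_reduced_gain"
    return "reject"
```

The middle outcome mirrors the remark that a predictor failing only condition (1) may still be used, with a smaller expected increase in predictability.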
VII. TESTING THE DEVELOPMENTAL MODEL ON INDEPENDENT DATA

Once the model has been developed and no further
predictors remain to be considered, we can test it for skills
($a_0$, $a_1$) on an independent data set (any set whose numbers
were not used to develop the model). This is easily
accomplished by sorting the independent data case values into
predictor intervals, determined from the dependent data, and
calculating the location in the forecast array (ll in Figs.
21b and 22b) of the appropriate prediction, using the
equations established in Section III. It is to be expected that
on average the test ($a_0$, $a_1$) points on the skill diagram, for
an independent data set, will not be as skillful as on the
set of developmental points.
APPENDIX B

LINEAR REGRESSION AND THRESHOLD MODELS

A. LINEAR REGRESSION

In this study a least-squares, multiple linear regression
model, known as BMDP2R in the BMDP Statistical Software
[University of California, 1981], was used. The procedure
used is called forward step-wise selection and picks the
predictors (of the many offered) that have the highest
correlation with the predictand (visibility) based upon F-to-enter
and F-to-remove limits, where F is a ratio which tests
the significance of the coefficients of the predictors in
the regression equation.


The regression model fitted to the data is

$$y = a + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p + \epsilon$$

where:

y = the dependent variable (predictand) which can
    be either a continuous function or a discrete
    value

$x_1, \ldots, x_p$ = the independent variables (predictors)

$b_1, \ldots, b_p$ = the regression coefficients

a = the intercept

p = the number of independent variables

$\epsilon$ = the error with mean zero.

The predicted value $\hat{y}$, and the general form of the resulting
equation, is

$$\hat{y} = a + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p$$

The step-wise selection of predictors continues until there
are no predictors remaining which meet the F-to-enter criteria.
The regression equation generated at each step is printed,
along with its R-value (the correlation of the dependent
variable y with the predicted value $\hat{y}$) and $R^2$. The resulting
set of equations, one for each step, is reviewed, and that
equation containing only those predictors which increased
$R^2$ by at least .01 is retained for application.

The role of regression, once appropriate predictor
variables have been selected, is essentially one of dimensionality
reduction (representing a multivariate structure by a
univariate proxy which constitutes a classificatory or predictive
index). This proxy takes the form of a polynomial, linear
in its coefficients, of the components of the multivariate
structure. The problem now becomes one of determining the
form of the state conditional distributions (one for each
state of interest; e.g., 1, 2 and 3 for visibility categories
I, II and III, as used in this study). Once an appropriate
form has been selected, it remains, then, to determine the
parameters of the class conditional distributions (e.g.,
means and variances) and then apply the decision criteria or
threshold model.
B. THRESHOLDS [LOWE, 1984a]

1. Notation

E = an event; this is an indicator variable which
    when E = 1, the threatening event occurs, and
    when E = 0, the non-threatening event occurs.

C = the classification of an unknown event which
    when C = 1, the event is classified as a
    threat, and when C = 0, the event is classified
    as a non-threat.

P[E=1] = the unconditional probability of occurrence of
    threat.

P[E=0] = the unconditional probability of occurrence of
    non-threat.

Error of the 1st kind (false alarm) [C=1 ∩ E=0].
Error of the 2nd kind (miss) [C=0 ∩ E=1].

P[C=1,E=0] = joint probability of an error of the 1st kind.

P[C=0,E=1] = joint probability of an error of the 2nd kind.

P[C=1|E=0] = class conditional probability of misclassifying
    a non-threat.

P[C=0|E=1] = class conditional probability of misclassifying
    a threat.

P[C=1,E=0] = P[C=1|E=0] P[E=0].

P[C=0,E=1] = P[C=0|E=1] P[E=1].

z = a univariate predictive index (equivalent
    to $\hat{y}$, above).

Z = range of the predictive index on the real line.

For a dichotomous problem, Z is divided into two parts, $Z_0$
and $Z_1$. The decision regions are mutually exclusive and exhaustive
(i.e., $Z_0 \cap Z_1 = \emptyset$ and $Z = Z_0 \cup Z_1$).

Thresholds = boundary(s) between decision regions.

p(z|E=0) = class conditional density of z given
    that E = 0.

p(z|E=1) = class conditional density of z given
    that E = 1.

$\Lambda(z)$ = p(z|E=1)/p(z|E=0) = the maximum likelihood
    ratio (i.e., the ratio of class conditional
    densities).

$P_e$ = P{[C=1 ∩ E=0] ∪ [C=0 ∩ E=1]} = the total
    probability of error.


2. Minimum Probability of Error Criterion

$$P_e = \text{probability of an incorrect classification} = P[C=1|E=0]\,P[E=0] + P[C=0|E=1]\,P[E=1]$$

where P[E=1] + P[E=0] = 1. Note that the events E = 1
and E = 0 are mutually exclusive and exhaustive. The
objective is to select decision regions (thresholds) so as to
minimize $P_e$.

$$P[C=0|E=1] = \int_{z \in Z_0} p(z|E=1)\,dz = \text{the probability of misclassifying } E = 1.$$

Since

$$\int_{z \in Z_0} p(z|E=1)\,dz + \int_{z \in Z_1} p(z|E=1)\,dz = \int_{z \in Z} p(z|E=1)\,dz = 1,$$

the following expressions are substituted into the expression
for $P_e$:

$$P[C=0|E=1] = 1 - \int_{z \in Z_1} p(z|E=1)\,dz$$

$$P[C=1|E=0] = \int_{z \in Z_1} p(z|E=0)\,dz$$

then,

$$P_e = P[E=0] \int_{z \in Z_1} p(z|E=0)\,dz + P[E=1] \left( 1 - \int_{z \in Z_1} p(z|E=1)\,dz \right)$$

and algebraic rearrangement yields,

$$P_e = P[E=1] + \int_{z \in Z_1} \{ P[E=0]\,p(z|E=0) - P[E=1]\,p(z|E=1) \}\,dz$$

In order to minimize $P_e$, $Z_1$ (the decision region for C = 1)
must include all those values of z for which the integrand
in the expression for $P_e$ will be negative. The decision
regions can be symbolically represented as follows:

$$Z_0 = \{ z : P[E=0]\,p(z|E=0) - P[E=1]\,p(z|E=1) > 0 \}$$

$$Z_1 = \{ z : P[E=0]\,p(z|E=0) - P[E=1]\,p(z|E=1) < 0 \}$$

An alternative representation is given by,

$$Z_0 = \{ z : P[E=0]\,p(z|E=0) > P[E=1]\,p(z|E=1) \} = \{ z : P[E=0]/P[E=1] > p(z|E=1)/p(z|E=0) \}$$

Likewise,

$$Z_1 = \{ z : P[E=0]/P[E=1] < p(z|E=1)/p(z|E=0) \}$$

These statements can be combined to give,

$$\Lambda(z) = \frac{p(z|E=1)}{p(z|E=0)} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{P[E=0]}{P[E=1]}$$

Thresholds are the value(s) of z for which

$$\Lambda(z) = P[E=0]/P[E=1]$$

This equation can be solved for z either analytically or
numerically depending on the forms of the density functions.
3. Threshold Cases

In order to exemplify the model, the assumption is
made that the class conditional distributions are Gaussian.
There are essentially three distinct cases that can arise.

a. Case I: Equal variances; different means
(Referred to as the equal variance model in the
text)

$$p(z|E=1) = k \exp\{ (-1/2)(z-\mu_1)^2/\sigma^2 \}$$

$$p(z|E=0) = k \exp\{ (-1/2)(z-\mu_0)^2/\sigma^2 \}$$

where $k = (2\pi\sigma^2)^{-1/2}$.

$$\Lambda(z) = \frac{\exp\{ (-1/2)(z-\mu_1)^2/\sigma^2 \}}{\exp\{ (-1/2)(z-\mu_0)^2/\sigma^2 \}} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{P_0}{P_1}$$

where $P_0$ = P[E=0] and $P_1$ = P[E=1]. Thus, the threshold
value is

$$z = \frac{\mu_0 + \mu_1}{2} + \frac{\sigma^2 \ln(P_0/P_1)}{\mu_1 - \mu_0}$$

[Figure: class conditional densities (Density) plotted against
the classification index (z), with the threshold marked.]

The position of the threshold depends on the relative values
of $P_1$ and $P_0$. The threshold moves toward the group with the
smallest P. If $P_1 = P_0$ the threshold will be the value of
z where the densities intersect (i.e., where the densities
are equal).
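The equal variance threshold is simple enough to check numerically. The sketch below assumes Gaussian class conditional densities with a common standard deviation and uses the closed form z = (mu0 + mu1)/2 + sigma^2 ln(P0/P1)/(mu1 - mu0); the function names and the sample means, variance and priors are hypothetical.

```python
import math


def gaussian_density(z, mu, sigma):
    """Gaussian class conditional density p(z | class)."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def equal_variance_threshold(mu0, mu1, sigma, p0, p1):
    """Threshold of the equal variance model (requires mu0 != mu1)."""
    return (mu0 + mu1) / 2 + sigma ** 2 * math.log(p0 / p1) / (mu1 - mu0)


z_equal = equal_variance_threshold(0.0, 2.0, 1.0, 0.5, 0.5)    # equal priors
z_skewed = equal_variance_threshold(0.0, 2.0, 1.0, 0.8, 0.2)   # P1 < P0
```

With equal priors the threshold sits at the midpoint of the two means. With P1 smaller than P0 it shifts toward the mean of group 1, and at that point the prior-weighted densities P0 p(z|E=0) and P1 p(z|E=1) are equal.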


b. Case II: Equal means; different variances

$$\Lambda(z) = \frac{\sigma_0 \exp\{ (-1/2)(z-\mu)^2/\sigma_1^2 \}}{\sigma_1 \exp\{ (-1/2)(z-\mu)^2/\sigma_0^2 \}} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{P_0}{P_1}$$

with the threshold

$$z = \mu \pm \left[ \frac{2\sigma_0^2 \sigma_1^2 \ln(P_1 \sigma_0 / P_0 \sigma_1)}{\sigma_0^2 - \sigma_1^2} \right]^{1/2}$$

Note that in this situation there are two thresholds. The
group having the smaller variance will lie between the two
thresholds.

[Figure: class conditional densities (Density) plotted against
the classification index (z), with the two thresholds marked.]

The thresholds shown are typical of a situation where $P_1 < P_0$.
Note that these thresholds lie between the two intersections
of the densities. If the inequality of prior probabilities
were reversed, the thresholds would lie outside of the
region between the two density intersections. Further note
that the decision region for the group having the lesser
variance lies between the thresholds.
c. Case III: General Solution (Referred to as
the Quadratic Model in the text)

$$p(z|E=1) = (k/\sigma_1) \exp\{ (-1/2)(z-\mu_1)^2/\sigma_1^2 \}$$

$$p(z|E=0) = (k/\sigma_0) \exp\{ (-1/2)(z-\mu_0)^2/\sigma_0^2 \}$$

$$\Lambda(z) = \frac{\sigma_0}{\sigma_1} \exp\left\{ \frac{1}{2} \left[ \left( \frac{z-\mu_0}{\sigma_0} \right)^2 - \left( \frac{z-\mu_1}{\sigma_1} \right)^2 \right] \right\} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{P_0}{P_1}$$

where $k = (2\pi)^{-1/2}$. Algebraic manipulation produces

$$(\sigma_1^2 - \sigma_0^2) z^2 + 2(\sigma_0^2 \mu_1 - \sigma_1^2 \mu_0) z + \left[ \sigma_1^2 \mu_0^2 - \sigma_0^2 \mu_1^2 - 2\sigma_0^2 \sigma_1^2 \ln(P_0 \sigma_1 / P_1 \sigma_0) \right] \;\underset{C=0}{\overset{C=1}{\gtrless}}\; 0$$

which is recognizable as a quadratic equation in z,

$$a z^2 + b z + c = 0$$

where:

$a = \sigma_1^2 - \sigma_0^2$

$b = 2(\sigma_0^2 \mu_1 - \sigma_1^2 \mu_0)$

$c = (\sigma_1^2 \mu_0^2 - \sigma_0^2 \mu_1^2) - 2\sigma_0^2 \sigma_1^2 \ln(P_0 \sigma_1 / P_1 \sigma_0)$

[Figure: class conditional densities for E = 0 and E = 1
(Density) plotted against the classification index (z).]

The remarks given for the figures in cases I and II are also
applicable here. More often than not, only one of a pair of
thresholds induced by differing variances will be of real
interest. If the variances of the two groups are radically
different, then both members of the threshold pair become
important.

In the foregoing, normal class conditional
distributions were assumed. This was done because the Gaussian
form admits of a rather clean analytical solution. However,
the general concept of the minimum probable error decision
criteria may be applied to any form of density function.
Indeed, the density function of one group need not even be
the same form as that for another group (one might be
exponential and the other Gaussian). The difficulty with most
non-Gaussian forms is that they seldom admit of closed analytical
forms and require numerical means in determination of thresholds.
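The quadratic model can be sketched in Python by solving a z^2 + b z + c = 0 with the coefficients a = sigma1^2 - sigma0^2, b = 2(sigma0^2 mu1 - sigma1^2 mu0) and c = (sigma1^2 mu0^2 - sigma0^2 mu1^2) - 2 sigma0^2 sigma1^2 ln(P0 sigma1 / P1 sigma0). The function names and sample parameters are hypothetical, and the handling of the degenerate cases (equal variances, no real root) is a choice made for this sketch.

```python
import math


def gaussian_density(z, mu, sigma):
    """Gaussian class conditional density p(z | class)."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def quadratic_thresholds(mu0, sigma0, mu1, sigma1, p0, p1):
    """Return the sorted list of thresholds (zero, one or two values)."""
    v0, v1 = sigma0 ** 2, sigma1 ** 2
    a = v1 - v0
    b = 2 * (v0 * mu1 - v1 * mu0)
    c = v1 * mu0 ** 2 - v0 * mu1 ** 2 \
        - 2 * v0 * v1 * math.log(p0 * sigma1 / (p1 * sigma0))
    if a == 0:
        # Equal variances: the equation is linear (equal variance model).
        return [-c / b] if b != 0 else []
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                       # densities never cross the ratio
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


thresholds = quadratic_thresholds(0.0, 2.0, 1.0, 1.0, 0.5, 0.5)
```

For these sample values two thresholds result, and the group with the smaller variance (group 1) is chosen for values of z between them. At each threshold the prior-weighted densities are equal, which gives a quick numerical check of the coefficients.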
APPENDIX C

NORTHERN HEMISPHERE PREDICTOR PARAMETERS AVAILABLE
FOR THE NORTH PACIFIC OCEAN, JULY 1979, EXPERIMENTS

Area: 30°-60°N; 145°E-130°W

Model output time: 0000 GMT (TAU00)

A.  Model output     Descriptive name of parameters
    parameters

    Primitive equation model

    TX               Surface air temperature
    EX               Surface vapor pressure
    EHF              Evaporative heat flux
    SEHF             Sensible plus evaporative heat flux
    THF              Total heat flux
    H510             1000-500 mb thickness anomaly
    GGTHTA           Surface-front location parameter
    FTER             Advective fog probability

    Mass structure model

    PS               Surface pressure
    TAIR             Surface air temperature
    EAIR             Surface vapor pressure
    TSEA             Sea surface temperature
    SSANOM           Sea surface temperature anomaly
    T925             925 mb temperature
    U925             925 mb zonal wind component
    V925             925 mb meridional wind component
    NCLOUD           Total cloud cover

    Marine wind model

    VVWW             Marine surface wind speed
    DDWW             Marine surface wind direction

    Climatological parameter

    CLIMO            National Climatic Center fog
                     frequency climatology

    Derived parameters

    ASTD             TAIR-TSEA
    RH               Surface relative humidity
APPENDIX D

NOGAPS PREDICTOR PARAMETERS AVAILABLE FOR THE NORTH
ATLANTIC OCEAN, 15 MAY-15 JULY 1983, EXPERIMENTS

Area: Entire North Atlantic Ocean and Mediterranean Sea

Model output time: 1200 GMT (TAU00)

A.  Model output     Descriptive name of parameter
    parameter

    D1000            1000 mb geopotential height
    D925             925 mb geopotential height
    D850             850 mb geopotential height
    D700             700 mb geopotential height
    D500             500 mb geopotential height
    D400             400 mb geopotential height
    D300             300 mb geopotential height
    D250             250 mb geopotential height
    TAIR             Surface air temperature
    T1000            1000 mb temperature
    T925             925 mb temperature
    T700             700 mb temperature
    T500             500 mb temperature
    T400             400 mb temperature
    T300             300 mb temperature
    T250             250 mb temperature
    EAIR             Surface vapor pressure
    E1000            1000 mb vapor pressure
    E925             925 mb vapor pressure
    E850             850 mb vapor pressure
    E700             700 mb vapor pressure
    E500             500 mb vapor pressure
    UBLW             Boundary layer zonal wind component
    U1000            1000 mb zonal wind component
    U925             925 mb zonal wind component
    U850             850 mb zonal wind component
    U700             700 mb zonal wind component
    U500             500 mb zonal wind component
    U400 *           400 mb zonal wind component
    U300 *           300 mb zonal wind component
    U250 *           250 mb zonal wind component
    VBLW             Boundary layer meridional wind
                     component
    V1000            1000 mb meridional wind component
    V925             925 mb meridional wind component
    V850             850 mb meridional wind component
    V700             700 mb meridional wind component
    V500             500 mb meridional wind component
    V400 *           400 mb meridional wind component
    V300 *           300 mb meridional wind component
    V250 *           250 mb meridional wind component
    VOR925 **        925 mb vorticity
    VOR500 **        500 mb vorticity
    PS               Surface pressure
    SMF              Surface moisture flux
    PBLD             Planetary boundary-layer depth
    STRTFQ           Percent stratus frequency
    STRTTH           Stratus thickness
    SHF              Surface heat flux
    ENTRN            Entrainment at top of marine
                     boundary-layer
    DRAG **          Drag coefficient (C_D)

    Derived parameters

    DTDP             Vertical gradient of temperature
    DEDP             Vertical gradient of vapor pressure
    DUDP             Vertical gradient of zonal wind
    DVDP             Vertical gradient of meridional wind
    RH               Surface relative humidity
    BM1 ***


Pole cet bozo x EATR) 
SNOW Soa je (0739 x T925) 
seen 9 <Eo2io) 
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BM2 


BM3 


BM4 


BM5 
BM6 


BM7/ 


** 


KK 


K*KxX 


K*K* 


*K*k* 


K*K* 


KkK* 


*K*kx* 


NO 


PUoSO2 te. eG L0 < TATR) 
Gaz OCr mean) = . 15960 x 1925) 


wOGiiooes (2117 7/1 x EALTR) 
oo eC Za ero Ome (. 19521 x E925) 


eco a (0004S x UBLW) 
(.000255 x U7/00) 


Pose Ooo x VL000) 


for ol eee OUOGIs x DEO00) 
(.0000489 x D700) 


or oO O% X PS) 
SOOO some ho): + (02642 x STRITH) 
+ (.06042 x SHF) 


*    Parameters which were not used due to their being
     considered as having little likelihood of being
     important in forecasting marine visibility.

**   Parameters which were not used due to loss of
     significant digits during transfer from tape
     to mass storage.

***  Linear regression equation parameters.
APPENDIX E

SKILL AND THREAT SCORES

                      OBSERVED
                   I     II    III
FORECAST    III    R     S     T
            II     U     V     W
            I      X     Y     Z

Total = R + S + T + U + V + W + X + Y + Z
P1 = (R+U+X)/Total          P3 = (T+W+Z)/Total
P2 = (S+V+Y)/Total          PN = greatest of P1, P2 or P3

Raw scores

A0   = % correct = (X+V+T)/Total
A1   = 1-class error = (U+S+Y+W)/Total
TS1  = Threat score for visibility category I
     = X/(R+U+X+Y+Z)
TS2  = Threat score for visibility category II
     = V/(S+U+V+W+Y)
TS12 = Threat score for visibility categories I and II
     = (X+V)/(Total-T)

TS12 is designed to represent the skill of forecasting
visibility categories I and II as separate categories, rather
than their skill as a combined category, which would be
(U+V+X+Y)/(Total-T).

Adjusted scores

AA0   = (A0-PN)/(1-PN)
ATS1  = (TS1-P1)/(1-P1)
ATS2  = (TS2-P2)/(1-P2)
ATS12 = (TS12-[P1+P2])/(1-[P1+P2])
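The raw and adjusted scores can be computed with a short Python function. This is a sketch under two assumptions: the 3x3 table is laid out with forecast categories III, II, I as rows (top to bottom) and observed categories I, II, III as columns, which is the layout implied by the formulas for P1, P2, P3 and A0; and TS2 uses the usual threat-score denominator for category II (all cases forecast or observed as II, that is S+U+V+W+Y). The function name and sample counts are hypothetical.

```python
def skill_and_threat_scores(table):
    """Scores from a 3x3 contingency table [[R,S,T],[U,V,W],[X,Y,Z]]."""
    (R, S, T), (U, V, W), (X, Y, Z) = table
    total = R + S + T + U + V + W + X + Y + Z
    p1 = (R + U + X) / total            # observed frequency, category I
    p2 = (S + V + Y) / total            # observed frequency, category II
    p3 = (T + W + Z) / total            # observed frequency, category III
    pn = max(p1, p2, p3)
    raw = {
        "A0": (X + V + T) / total,
        "A1": (U + S + Y + W) / total,
        "TS1": X / (R + U + X + Y + Z),
        "TS2": V / (S + U + V + W + Y),
        "TS12": (X + V) / (total - T),
    }
    adjusted = {
        "AA0": (raw["A0"] - pn) / (1 - pn),
        "ATS1": (raw["TS1"] - p1) / (1 - p1),
        "ATS2": (raw["TS2"] - p2) / (1 - p2),
        "ATS12": (raw["TS12"] - (p1 + p2)) / (1 - (p1 + p2)),
    }
    return raw, adjusted


# Hypothetical counts, for illustration only.
raw, adjusted = skill_and_threat_scores([[1, 2, 30], [3, 10, 4], [20, 5, 2]])
```

Each adjusted score subtracts the relevant observed frequency from the raw score and rescales, so a forecast that does no better than that frequency scores zero or less.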
APPENDIX F

TABLES

TABLE I. A SUMMARY OF THE OBSERVATIONS (PERCENTAGE
FREQUENCIES) OF THREE VISIBILITY CATEGORIES
(VISCAT'S), FOR THE NORTH ATLANTIC OCEAN
HOMOGENEOUS AREAS SHOWN IN FIG. 1, 15 MAY-
15 JULY 1983

        NUMBER OF
AREA    OBSERVATIONS    VISCAT I    VISCAT II    VISCAT III
1 oie S 163 (.06) 436 (.16) 2126 
2 2867 aC eO)) Beri lee) 2273 
3E 1 8 (.06) Slee 92 
3W 2288 437 (.19) 284 (.12) Siow 
4 4771 2 es 3) Soe te) 4045 
SE 1087 Ss 301) 94 (.09) 984 
SW 2307 8° (.003) 40 (.02) 2259 
6N 580 one. 0 Ss) 45 (.08) Seo 
6M 2257 ae (e010) 1a. 06) 2185 
6S 60 GeO 2) Za 0'3) 7 
7 801 eo) 34 (.04) 760 
8 1284 erect) Damn) 1256 
ENTIRE NORTH ATLANTIC AND MEDITERRANEAN 
21,238 COMER >) ) 9 2038 W(210) 18,120 
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be gl 


218) 
oD 
7 70)) 
.68) 
55) 
i) 
<2 3} 
Oo) 
a9 3) 
525) 
or) 


91S) 
TABLE II. NUMBER OF OBSERVATIONS (PERCENTAGE FREQUENCIES)
OF THREE VISIBILITY CATEGORIES (VISCAT'S),
AND 95% CONFIDENCE INTERVALS FOR THE
DEPENDENT AND INDEPENDENT DATA, FOR THE NORTH
PACIFIC OCEAN AND AREA 3W OF THE NORTH
ATLANTIC OCEAN


Herth Pacific Ocean, July 1979 


TOTAL # OF 
Wil CAs Wako erie 1 1 VISCAT III OBSERVATIONS 

pot CI yO 7=2229 ~126-.144 .635~-.660 

Dependent data eo. irae 2 ) Boise oo) 8 2o08 (G40) 3682 
Independent data 388 (.211) PAGe 2134). 1207 (2656) 1841 
Total P20 (a2 8) (a4. 135) 9 S575 (2647) 35235 
North Atlantic Ocean area 3W, FATJUN 1983 

95% CI ee 5-207 podi=.138 Ob6—.704 

Dependent data Zot eee 4 } Poon 2i25) 1040 (2.652) 1526 
Independent data 141 (.185) 94 (.123) a | (eg 2) 762 
iota tL Ae 9 1 ) Mice 24) 1S>67 (2685) 2288 
TABLE III. THE INITIAL FIVE BEST PREDICTORS FOR
EPI'S OF FOUR THROUGH TEN, FOR EACH
STRATEGY, WITH ASSOCIATED PP, a0, a1
AND CE VALUES FROM THE NORTH PACIFIC
OCEAN DEPENDENT DATA, JULY 1979


Maximum-probability Natural-regression 














EPI Predictor PP ay ay CE Ao ay CE 
4 BAF oS .064 wes > 2497 491 467 ooo 
Sine SS) ~Kaore Lk pli oO oS S27 OMe sao -569 
FTER pee I -680 ese S)5 510) o 5452 468 B00 6 
CLIMO -296 Ooi aS pearl pa ee oe oe OO 

RH Pole 649 ls 5 nO 7 Poot a2 eee Oo 

5 EHF woow. woo ~~ 1eas 471 2435 Oe jee 
S EHF eoLg NOOOaL aks o ~489 7555 -400 50 
FTER 314 OW Oe eloo 2509 “209 . 396 526 

RH eel BoiG. Waals o 549 449 ZouLG 584 
CLIMO Po eecse les 49 Vase as4on eens 

6 EHF S350. oS 1835 nas 491 467 soo 
SEHF ED .690 plies 5 485 276 «475 OO 

FTER 5.8 Pow oi. 44055 woek9 POEs - 349 20 

RH 36 ns le) leyll es 5 Se 0S eee 942 

er EMO 295 oo Soo a4 / 471 <a oe oO 

7 EHF SiO ON, OOS ales ss ~479 -o 29 ans oe 
SEHF. go eD 685 pie os waa 9-5 ~DiZee = ky noo 7 

mtr R oO BOE less LS e525 aay, ora 
CLIMO owt oO 6b) 14-135 oa 5 PAS 5 ee oor ee OU Z 

RH ~ 314 fb Do BS P54 7 POU ood sc) oO 

8 EHF woo 6 MOG. 735 ~489 - a9) 5467 oo 
SEHF 20 MO ore, wks 5 me) 3 -478 .475 Pog 

FTER .320 .680 .135 .505 SSM 7 ae ey 

eit MO oO 2663 7.35 ooo 404 SG a) W625 

RH oS Ol ou, 35 Po Boo) 28 aL 2543 
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lela 
Sit 
eR 
era hMoO 
RH 


BPE 
Sig 
eek 
CLIMO 


. 340 
oe 
oes 
20 
esl 


oan 
<oOce 
2322 
oO) 
onl 


TABLE III 


Oo 
.686 
Oro 3 
Seer 
polo 


1000 
-688 
ONES 
52 
nome 


ESyS: 
oo 
eo 
yD 
23) 


sd SD 
nSS 
oS 
spl 
3 


89 


(EONT . ) 


2479 
493 
e493 
ro 
Sisk 


;473 
.489 
Oo 
ook 
sag 


eee 
fod 
.974 
~443 
-476 


sa 97 
-934 
Joo 
ais 
BUS 


a2) 
429 
nog 
e016 
~482 


467 
~401 
20916 
nao 
441 


ao L 
74 3 
,505 
wo? 
SES 


oD 
v5. 
26 
Ou 2 
aS 





Peto) a koto LAGE, CONTINGENCY TABLE STATISTICS 
AO, TS1, AAO AND ATS1 FOR BOTH DEPENDENT 
AND INDEPENDENT NORTH PACIFIC OCEAN, JULY 
oy oy mara, eOR EPT"S OF FOUR THROUGH TEN 
AND THE MAXIMUM-PROBABILITY STRATEGY, WITH 
Pitot Deol PREPETCTOR FOR EACH NUMBER 


CEesel.s 
Dependent data Independent data 
EPI AQ Sl AAO ATSL AO SL _AAQ ATS1 
4 .684 ae O02”, <comlles es MoeGr vaso) «087 gl 
5 moo 7 HOS) S650 a MO ens oe a al 
6 HOOD roe ala as Ovo 2.50 117 mee 
7 Oo 3 a0 eos nO BOOS, 260 «107 09 
8 .688 eg eae 6 ao moe goes 9210 08 
2 709 3 MOG. (Leg ce Hoos we 34 4 mG 
10 796 255) 4a 49 arly BGO Soo eld wee 
TABLE V. FD(96), FD, RSS FD AND a0 FOR STRATEGY
MAXPROB2, NORTH PACIFIC OCEAN, JULY 1979,
DEPENDENT DATA, FOR THOSE PREDICTORS
SELECTED AT EACH STAGE OF THE DEVELOPMENTAL
MODEL, USING FIVE EPI'S. FD(96) IS
COMPUTED FROM 100 RANDOMLY GENERATED DATA SETS
AS EXPLAINED IN APPENDIX A, AND PROVIDES
A MEASURE OF HOW MUCH ADDITIONAL
PREDICTABILITY MAY BE EXPECTED FROM THE INCLUSION
OF A NEW PREDICTOR. IDEALLY, RSS FD
SHOULD BE LESS THAN FD(96)
FD, of predictor added, on 

Predictor 

added Ee 0) EHF DDWW_ 4H510 Ro  RSSenD 

EHF = = ' = 7 

DDWW wooo ~ 1494 = = = ~ 1494 

510 ae oHA0 -2488 .2185 = = oo oer 

RH Bea PaO, @. 2007) 1515 = . 3666 

EE? 27-910 —oCOm edo tore .1907 24408 

CLIMO pono oe ci met eas «66. 2551 


* 

[Statistic illegible] was not computed for CLIMO, as the choice for
the sixth predictor was between only CLIMO and SEHF.
It was more economical to compute contingency table
statistics for each and to choose the best predictor
from those results.


Gel 


f 


a: 
Ooms 
699 
704 
~ 1/46 
2o2U 


BOOZ 





TABLE VI. 


CONTINGENCY TABLES AND RELATED STATISTICS FOR 
BOTH DEPENDENT (3682 OBSERVATIONS) AND 
INDEPENDENT (1841 OBSERVATIONS) NORTH PACIFIC 
OCEAN, JULY 1979, DATA, FROM STAGE FOUR OF 
THE DEVELOPMENTAL MODEL. PREDICTORS ARE EHF, 
DDWW, H510 AND RH, EACH DIVIDED INTO FIVE 
EPI'S, FOR (A) MAXPROB1, (B) MAXPROB2 AND 

(C) NATURAL-REGRESSION 
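The statistics reported with each contingency table can be illustrated with a minimal sketch. The definitions of A0, TS1 and the related scores lie outside this section, so the sketch assumes that A0 is the fraction of forecasts falling in the correct category and that TS1 is the conventional threat score for category I. The function names, the table layout (rows are forecast categories, columns are observed categories) and the example counts are hypothetical and are not taken from the tables.

```python
def fraction_correct(table):
    """Fraction of cases on the diagonal (assumed here to correspond to A0)."""
    total = sum(sum(row) for row in table)
    hits = sum(table[k][k] for k in range(len(table)))
    return hits / total


def threat_score(table, k):
    """Conventional threat score for category index k:
    hits / (forecasts of k + observations of k - hits)."""
    hits = table[k][k]
    forecast_k = sum(table[k])                  # row total
    observed_k = sum(row[k] for row in table)   # column total
    denom = forecast_k + observed_k - hits
    # Assumed convention: score is 0 when the category never occurs.
    return hits / denom if denom else 0.0


# Hypothetical 3 x 3 table: rows = forecast I, II, III; columns = observed I, II, III.
example = [
    [50, 10, 5],
    [20, 15, 10],
    [30, 25, 200],
]

print(fraction_correct(example))
print(threat_score(example, 0))
```

Keeping the table as a plain nested list keeps the sketch free of dependencies; the same two functions apply unchanged to the two-category tables of the decision-tree method.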


(a) MAXPROB1 


DEPENDENT DATA 


FORECAST 





INDEPENDENT DATA 


FORECAST 





AOQ= .75 AAOQ= .29 

Al= .13 

TS1= .44 ANSt= «<8 

TS2= -14 ATS2= 01 

Sl = 9297 ATS12= .02 

OBSERVED 

RO. 70: AAO= ile? 

Al= > 

TS1= .34 Asi= 217 

SO, ATS2= -.06 
TS12= .28 ATS12= 7-19 


OBSERVED 


Ore 





TABLE VI (CONT.) 


(b) MAXPROB2 


DEPENDENT DATA 





AO= .75 AAOQ=  ,.29 

bp 

uv) Al= Sib 

< 

O 

lu AeSu= k= 4-7, ATS{t= <«<32 

x 

O 

u TS2= (18 ATS2= .06 
TS12= 42 ATS12= 10 

OBSERVED 
INDEPENDENT DATA 

AO= .69 AAO= 09 

-_ 

un) Al= .16 

< 

O 

TT TS1= .37 Atesi= = 20 

x 

© 

U. Ts2-) 09 AVSo-) —.05 
Teo = 2.3L ATS12= 7-99 





OBSERVED 


a5 





TABLE VI 


(CONT) 


(c) Natural-Regression 


DEPENDENT DATA 


FORECAST 


OBSERVED 


INDEPENDENT DATA 


FORECAST 


OBSERVED 
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AO= .62 
B= ee. 35 
Tod meee 7 
2 eee 
TS12= 27 
AO= .58 
Al= 35 
oi 219 
soe a7 
TS12= 99 


AAO = 


ATS1= 


ATS2= 


ATS12=-. 


AAO= 


ATS1= 


ioe 


ATS12= 


06 


206 


oS 


eS 


we 1 


OZ 


04 


[io 





TABLE VII. LINEAR-REGRESSION EQUATION FOR THE PREDICTED
VALUE OF THE VISIBILITY CATEGORY (ŷ), ŷ
STATISTICS WITH RESPECT TO THE ACTUAL VISIBILITY
CATEGORIES (y) AND THRESHOLD VALUES
FROM THE EQUAL-VARIANCE ASSUMPTION MODEL,
NORTH PACIFIC OCEAN, JULY 1979. NOTATION
IS AS IN APPENDIX B.


[eo omc o + U4 eCd Eh) = 941i 2(PTER) -— .01592(RH) 


Class conditional distributions (i.e., distribution of ŷ for
a given y).


Number of Frequency Mean value Standard
observations of of deviation of
y of y y (p) ŷ (m) ŷ (σ)
i: 816 m2 Zag (m, ) . 348 
Z 498 we3S OS (m.) mC 2 
3 ZOOS 643 20S (m.) oo 
Ty = threshold between y = 1 and y = 2 = 2.506 
tT) = threshold between y = 2 and y = 3 = 1.768 
T, = threshold between y = 1 and y = 3 = 2.048 


Class conditional distributions for visibility category I
(y = 1), II (y = 2) and III (y = 3) depicting threshold
values and means.
TABLE VII (CONT.)


[Plot: density (vertical axis) versus predicted value ŷ (horizontal axis) for the three class conditional distributions, marking the means and thresholds.]
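The equal-variance thresholds can be illustrated with a short sketch. It assumes that each class conditional distribution of ŷ is normal with a common standard deviation and that the threshold is the point where the frequency-weighted densities of two categories are equal; this is the standard two-class decision boundary and may differ in detail from the derivation in Appendix B. The function name and the example values are hypothetical.

```python
import math


def equal_variance_threshold(m_a, m_b, sigma, p_a, p_b):
    """Point where p_a * N(m_a, sigma) equals p_b * N(m_b, sigma).

    m_a, m_b : class conditional means of the predicted value
    sigma    : common standard deviation (the equal-variance assumption)
    p_a, p_b : relative frequencies of the two categories
    """
    midpoint = 0.5 * (m_a + m_b)
    # The frequency term shifts the threshold away from the more frequent class.
    shift = sigma ** 2 * math.log(p_a / p_b) / (m_b - m_a)
    return midpoint + shift


# Equal frequencies give the midpoint between the two means.
print(equal_variance_threshold(1.0, 2.0, 0.5, 0.5, 0.5))
# A rarer first category moves the threshold toward its own mean.
print(equal_variance_threshold(1.0, 2.0, 0.5, 0.2, 0.8))
```

Because the shift grows with the variance and with the imbalance in frequencies, thresholds between neighboring categories need not fall in the same order as the category means.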
TABLE VIII.


CONTINGENCY TABLES AND RELATED STATISTICS
FROM LINEAR REGRESSION, FOR BOTH DEPENDENT
(3682 OBSERVATIONS) AND INDEPENDENT (1841
OBSERVATIONS) NORTH PACIFIC OCEAN, JULY
1979, DATA


DEPENDENT DATA 


FORECAST 





INDEPENDENT DATA 


FORECAST 





AO= .69 
ae ee eA 
hsl=— 35 
S200 

owe 2 

OBSERVED 

AO= 69 
Ae 1 3 
Ela 34 
25 0). 0 

TS12= .26 


OBSERVED 


oF 


AAO= -14 
ATS1= .17 
ATS2=-.-16 
AtSi2=— 12 
AAO= relat 
ATS1= .16 
ATS2= -.15 
ATS12= -.13 





TABLE IX. FIVE BEST PREDICTORS FOR EPI'S
OF FOUR THROUGH TEN, FOR EACH STRATEGY,
WITH ASSOCIATED PP, A0, A1 AND CE VALUES
FROM THE NORTH ATLANTIC OCEAN AREA 3W
DEPENDENT DATA, 15 MAY-15 JULY 1983,
WITHOUT LINEAR-REGRESSION EQUATIONS AS
PREDICTORS

Maximum-probability Natural-regression 

EPI Predictor PP A0 A1 CE A0 A1 CE
4 E850 soe. eee «125 2482 coeds PecaaGs 2526 
SHF POiome oon & ads. 224 9:3 S52 ees 452 1: 

DTDP Poa 26cm. 2.125. -75905 soll) [sed .474 

Eo25 to 20S ez > 905 oOo wetos, fe5o7 

SMF Yooe 2682) Zo? Jecok 1 yon mes O le 437 

5 E925 Roew se 20 sales. 2472 64 ooo a, 294 
E850 moose POU) as LZow 2475 Poe Bore Nae 

DTDP moe4 5099 2.25 2477 eon ge 209 oe 

SHF SS Se) ILE ee, 6 Foc. 2463 

SMF oe oe>6  si25 7.503 so26 2409 530 

6 DTDP Nooo ee 5O. lao, 3.456 JD Oe PesioOn wr 0o 
E850 Pore OIF el.) 2477 5009 «..324) ):. 4538 

SMF moor 699) =ade2a); 2477 25054) Yesode Oe o4 

e925 POS OFS-, .akZo 485 OJ 5n wooo a6 

SHF ovo OoS L255. 469 GDL 2) @r oe © oie oie). 

i DTDP wosO eG ~.125 7.2443 olan 2429 6. 54:2 
SMF Poa) een Sots 6246 3 PoOe Been f495 

E850 Ro oo ero eee 47 7] SOI Geog 1.489 

Bo 2 5 Pore eoo2  2b25 >. 49): -547 .400 .506 

SHF momo 469) 25. 493 548 .407 .497 

8 SMF yooe «/14) ~125%..448 Foo es | 328 
DTDP wooed 2. 25> ee 45 1 ~-611 .304 .474 

E850 Owioee OO. aio. wad 75 focus woos .469 

Ser woe oo 2S) 3493 Sea | (LoL 

BY 25 PoCeemeooo | 2125. .505 Foe OU. 266 


2:6 





10 


SMF 
PiDP 
SHE 
E350 
EoD 


SMF 
bipr 
E925 
Eo > 0 
oHF 


TABLE IX 
ome «14 
Toes OS 
oe ee 700 
2370 (0ol9 
sO f «O29 
eee ee ee, 
304, ~ 2 7an0 
yoGg Nw 702 
oO). 700 
wooue! 4.698 


vio 
eZ 
p25 
yR2 5 
ZS 


Sh 2> 
eal > 
ez 
lec 
lez oS 


(CONT .) 


oo 


-448 
2459 
a gs) 
477 
477 


437 
aie o 
471 
ge 72 
aD 


. 0S 
208 
041 
a0 
Ore 


ROG 
iow 
-964 
6 
a SIR 


OU) 
. 360 
ray 
-402 
414 


.409 
oa 
noo 
nS hO 
eoIOnS 


os 
.904 
mole al 
.498 
2 


slag 
497 
FAIS 
.478 
-483 





TABLE X. FIRST-STAGE CONTINGENCY TABLE STATISTICS A0,
TS1, AA0 AND ATS1 FOR BOTH DEPENDENT AND
INDEPENDENT NORTH ATLANTIC OCEAN AREA 3W,
15 MAY-15 JULY 1983, DATA, FOR EPI'S OF FOUR
THROUGH TEN AND THE MAXIMUM-PROBABILITY
STRATEGY, WITHOUT LINEAR-REGRESSION EQUATIONS
AS PREDICTORS





Dependent Independent
Best
EPI Predictor A0 TS1 AA0 ATS1 A0 TS1 AA0 ATS1
4 E850 ste. @eusoe. «05 fae 209 Mee 0 = 01 14 
=) OI? 5 p70 2505 306 agile ee OE a Ol 14 
6 DTDP Se We oir) 29 wie five? 86305 calles 
i DTDP ee oe Sele ed Lee ee ae 
8 SMF, SUP Oe te 0 le 5 3: hh ee aes 
9 SMF Spe comb LO mols oo) 3 ee Ore tee 09 
‘10 SMF Wise eeciow 309 OS wio VaZe “als 06 










TABLE [number illegible]. SAME AS TABLE IX, EXCEPT WITH LINEAR-
REGRESSION EQUATIONS AS PREDICTORS




















Maximum-probability Natural-regression 
EPI Predictor PP A0 A1 CE A0 A1 CE
4 BML aes oO. gl Zzo. 6370 woo waece Be. 504 
BM3 ey ea «pele. 2 O92 meco wee /0 2400 
BM2 Reo seo ele) 450 Peo Meo Vago LZ 
BM7 woe eno: Woes: ~465 oes. Meo «315 
Be 50 Soc eo! seal 4 82 Told Wee o. 2526 
5 BM1 Meroe woe. ele. B37 / Poo ame oO wena 4 7 
BM3 os, edo Bale >- 2377 so90 3374  .446 
BM2 eo 00 (2 / “slZ5 9.421 poo  wroos "462 
BM7 Sooo 2G wa lZay .444 7904-393 ..480 
Eo 2 moor Ore eee oe | 24 72 2504 4 2379.) 2494 
6 BM1 ooo galzae 6372 POG Meso 2,, «4s 
BM3 PaaS e eeoo 225255 1.383 O25. “seo. 2422 
BM7 eos wee alge. ~425 SOOR 2838) 14453 
BM2 Moe 2725 ..2g8257) 2429 ,o7 2454 .512 
DTDP omer ses 00 size 324.56 Bo CSO oO S 
7 BM1 Mo bea 9 eZ oe SSS iG) es, 0 3 ood 
BM3 foe Ae oO) Co Z25e 5.394 2 DO oo eee ore 
BM2 MOO One 2720 02S, 62419 -554 .406 .486 
BM7 WA04, 2721. ~125) 9.434 wa 0F Pe SOS ee ooo 
DTDP moe: sf bG 2.1 Zs. Med 43 ,5914 -2429 . 1.542 
8 BM1 MeO Aero. eo ® > 7/0 5O0 67 . eo ooy eo L 
BM3 ore ae 4 2 wee om 392 O01" mao 0 -h.440 
BM2 SOD 24 Meow (6a 2 7 ASG Dv, eee.” V46G 
BM7 OOp hoo eco, « 429 SOOO S, ..c4 FZ 
SMF O52. eee La; belize) 4 448 HIS OCT on OZ 


OnE 
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<kZ> 
aS 


OZ 


oc 
DOG 
~430 
434 
.448 


Sh 
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~421 
~433 
.438 


oie 
OI. 
-540 
2547 
OS 
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.647 
5016 
.964 
226 


.250 
248 
.427 
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360 
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wou 
.409 


aoe 
ooHn 
~493 
-491 
.514 


230 0 
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471 
TABLE [number illegible]. SAME AS TABLE X, EXCEPT WITH LINEAR-
REGRESSION EQUATIONS AS PREDICTORS AND
[illegible] FOR EACH NUMBER
OF EPI'S
Dependent Independent 
EPI A0 TS1 AA0 ATS1 A0 TS1 AA0 ATS1
4 7 5 45 a2 ee. ~74 43 ey 230 
5 LS: ~42 zal ZS i 41 ly pee 
6 ares 41 mee me 5 . 40 ES w2G 
i nO ely 0 ae ho nog al 545 
8 eS uae Boe coe 74 43 olny 20 
9 iO 44 Be 5 Sil 75 ~42 pales eo 
Ie 2 42 Bz 22S iS 41 5 dy eS 
TABLE XIV. FD(96), FD, RSS FD AND A0 FOR STRATEGY
MAXPROB2, NORTH ATLANTIC OCEAN AREA 3W, 15
MAY-15 JULY 1983, DEPENDENT DATA, WITHOUT
LINEAR-REGRESSION EQUATIONS AS PREDICTORS,
FOR THOSE PREDICTORS SELECTED AT EACH STAGE
OF THE DEVELOPMENTAL MODEL USING FIVE EPI'S.
FD(96) IS COMPUTED FROM 100 RANDOMLY GENERATED
DATA SETS, AS EXPLAINED IN APPENDIX A, AND
PROVIDES A MEASURE OF HOW MUCH ADDITIONAL
PREDICTABILITY MAY BE EXPECTED FROM THE
INCLUSION OF A NEW PREDICTOR. IDEALLY, RSS
FD SHOULD BE LESS THAN FD(96).

FD of predictor added, on
Predictor

Added FD(96) E925 U700 DVDP STRTFQ ENTRN RSS FD A0
m2 5 = = - — - - - a702 
U700 pelea dG a oelet) = = - = ieee 706 
DVDP ~2147 L581 .1494 - - - e275 21a 
STRTFQ eo 2 woo ~~wliW04, 7 l427 - - 2844 ole 
ENTRN 0 6 “noe slooG 21734 eS 7 = olay ao ie 
PS . 3394 igo) 7. Lio eee 2 Som) 51. 40'5 3887 950 
TABLE XVII.


CONTINGENCY TABLES AND RELATED STATISTICS FOR
BOTH DEPENDENT (1526 OBSERVATIONS) AND INDEPENDENT
(762 OBSERVATIONS) NORTH ATLANTIC
OCEAN AREA 3W, 15 MAY-15 JULY 1983, DATA,
WITHOUT LINEAR-REGRESSION EQUATIONS AS
PREDICTORS, FROM STAGE FIVE OF THE DEVELOPMENTAL
MODEL. PREDICTORS ARE SMF, D850,
RH, UBLW AND ENTRN, EACH DIVIDED INTO EIGHT
EPI'S, FOR (a) MAXPROB1, (b) MAXPROB2 AND
(c) NATURAL-REGRESSION


(a) MAXPROB1


DEPENDENT DATA 


FORECAST 





INDEPENDENT DATA 


FORECAST 





AO= ,98 AAO= .95 
Al= .01 

TS1= .95 ATS1i= -94 
7TSs2-. 91 mein 
TSi2= -995 ATS12= °°? 

OBSERVED 

AO=  .70 AAO= 04 
Al= EG 

TS1= .34 AGTS1= .19 


ioc 215 AgS2- ..03 


on es 7 AToio=—.05 


OBSERVED 
TABLE XVII (CONT.)


(b) MAXPROB2 


DEPENDENT DATA 


FORECAST 


OBSERVED


INDEPENDENT DATA 


FORECAST 


OBSERVED
TABLE XVII (CONT.)
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(c) Natural-Regression 


DEPENDENT DATA 


FORECAST 


OBSERVED 


INDEPENDENT DATA 


FORECAST 


OBSERVED 
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TABLE XVIII. SAME AS TABLE XVII, EXCEPT FOR FIVE 
EPI'S. PREDICTORS ARE E925, U700, DVDP,
STRTFQ AND ENTRN 


(a) MAXPROB1


DEPENDENT DATA 


AO= .92 AAO= 4.74 
~ 
Ww Al= eO.5 
<q 
O 
Lu mou=- sf? ATSi= .71 
o 
eS ae 
ie TS2=- -63 ATS2= -? 
PSi2= 795 ATS12= .63 





OBSERVED 


INDEPENDENT DATA 


AO= AAO= .09 
" pf . 
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o 19 20 Dai 
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cr 
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% TS2= .14 ATS2= .02 
1 14 
TS12= 99 ATS12=_ 92 
1 2 3 


OBSERVED 
TABLE XVIII (CONT.)


(b) MAXPROB2 


DEPENDENT DATA 


FORECAST 


OBSERVED


INDEPENDENT DATA 


FORECAST 


OBSERVED 
TABLE XVIII (CONT.)


(c) Natural-Regression 


DEPENDENT DATA 


FORECAST 


OBSERVED


INDEPENDENT DATA 


FORECAST 


OBSERVED 
TABLE XIX. CONTINGENCY TABLES AND RELATED STATISTICS
FOR BOTH DEPENDENT (1526 OBSERVATIONS) AND
INDEPENDENT (762 OBSERVATIONS) NORTH ATLANTIC
OCEAN AREA 3W, 15 MAY-15 JULY 1983, DATA,
WITH LINEAR-REGRESSION EQUATIONS AS PREDICTORS,
FROM STAGE FOUR OF THE DEVELOPMENTAL MODEL.
PREDICTORS ARE BM1, U850, D500 AND V850,
EACH DIVIDED INTO FOUR EPI'S, FOR (a) MAXPROB1,
(b) MAXPROB2 AND (c) NATURAL-REGRESSION


(a) MAXPROB1


DEPENDENT DATA 





AO= .79 AAO=— 34 
= 
” Al= .12 
<q 
O 
LU TS1i= , 50 ATS1= 
a oo 7 
O 
= TS2= (10 ATS2=- 02 
TS12= (40 ATS12= .12 
OBSERVED 
INDEPENDENT DATA 
= A = 
9) ee ae 
a TS1= .51 ATS1= .40 
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TABLE XIX (CONT.) 


(b) MAXPROB2 


DEPENDENT DATA 





or=— seo AAO= .34 
= 
Wv) Al= Saaz. 
= § 
O 
uw TS1= .51 ATSi=9 46 
o 
O 
uL TS2= .10 ATS2=-.02 
nSo12= 42 ATS12= .16 
1 Z 3 
OBSERVED
INDEPENDENT DATA 
AO= 7G AAO= aoa 
— 
” Al= 2 
<x 
O 
uu note 5k ATSi=  -40 
~ 
O 
LL WS = jee 0S MGiGO= —.05 


TS1I2= (39 A@oste— 12 
TABLE XIX (CONT.)


(c) Natural-Regression


DEPENDENT DATA 


FORECAST 


OBSERVED 


INDEPENDENT DATA 


FORECAST 


OBSERVED 
TABLE XX.


SAME AS TABLE XIX, EXCEPT RESULTS ARE FROM
STAGE TWO IN THE DEVELOPMENTAL MODEL AND
PREDICTORS ARE DIVIDED INTO EIGHT EPI'S
EACH. PREDICTORS ARE BM1 AND U500


(a) MAXPROB1 


DEPENDENT DATA 


FORECAST 





TS1= 
TS2= 


inode" 


1 2 3 
OBSERVED 


INDEPENDENT DATA 


FORECAST 
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aah 
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56 13 484 
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: 2) Se 


TS2=0. 
eS 
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1 2 3 
TABLE XX (CONT.)


(b) MAXPROB2 


DEPENDENT DATA 


FORECAST 


OBSERVED 


INDEPENDENT DATA


FORECAST 
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TABLE XX (CONT.) 


(c) Natural-Regression 


DEPENDENT DATA 


FORECAST 


OBSERVED


INDEPENDENT DATA 


FORECAST 


OBSERVED 
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AO= .64 
a eS | 
TS1= 93 
TS2= jo 
TABLE XXI.


LINEAR-REGRESSION EQUATIONS FOR THE PREDICTED
VALUE OF THE VISIBILITY CATEGORY (ŷ), FOR BOTH
REGRESSION METHODS, ŷ STATISTICS WITH RESPECT
TO THE ACTUAL VISIBILITY CATEGORIES (y) AND
THRESHOLD VALUES FROM BOTH THRESHOLD MODELS,
NORTH ATLANTIC OCEAN AREA 3W, 15 MAY-15 JULY
1983. NOTATION IS AS IN APPENDIX B


A. Definitions:


LR1 - Linear regression method 1: single equation,
      three visibility categories
LR2 - Linear regression method 2: decision-tree; two
      equations, two visibility categories each
a   - All predictors were made available to the
      regression model.
b   - Only the best predictors from the Preisendorfer
      (1983 a,b,c) methodology were made available
      to the regression model.
A   - Quadratic threshold model (Case III, Appendix B)
B   - Equal variance threshold model (Case I, Appendix B)


B. LR1a
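How the decision-tree method turns two regression outputs into one of three visibility categories can be sketched as follows. The sketch relies on the coding given later in this table: equation 1 separates combined categories I and II (y = 0) from category III (y = 1), and equation 2 separates category I (y = 0) from category II (y = 1). The function name, the argument names and the treatment of values exactly on a threshold are assumptions; the two thresholds in the example are the LR2bA values listed in this table.

```python
def classify_decision_tree(yhat1, yhat2, t1, t2):
    """Return visibility category 1, 2 or 3 from two regression outputs.

    yhat1 : predicted value from equation 1 (I and II versus III)
    yhat2 : predicted value from equation 2 (I versus II)
    t1, t2: thresholds for equations 1 and 2
    """
    # First branch: is the case in category III?
    if yhat1 >= t1:          # assumed: ties go to the higher category
        return 3
    # Second branch: choose between categories I and II.
    return 2 if yhat2 >= t2 else 1


# Thresholds for LR2bA as listed in Table XXI.
T1, T2 = 0.4922, 0.5208

print(classify_decision_tree(0.60, 0.10, T1, T2))  # 3
print(classify_decision_tree(0.30, 0.60, T1, T2))  # 2
print(classify_decision_tree(0.30, 0.20, T1, T2))  # 1
```

Equation 2 is consulted only when equation 1 has ruled out category III, which is why each equation needs only two categories.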


Me 2 eo Oo (mA) = P00 237(E850) = .07319(7T925) 


See oO (E925) 


Class conditional distributions (i.e., the distribution of ŷ
for a given y).


Number of Frequency Mean value Standard
observations of of deviation of
y of y y (p) ŷ (m) ŷ (σ)
1 29.6 .194 oO 4 (m, ) 2434 
2 oO aL 2.324 (ma) oo 
5 1040 2O82 Da BWA (m.) A ay 


TABLE XXI (CONT.)


LR1aA

Ty = threshold between y = 1 and y = 2 = 2.275 
T. = threshold between y = 2 and y = 3 = 1.839 
T = threshold between y = 1 and y = 3 = 2.008 


(second threshold value, of the pair, was of no interest. 
See Appendix B) 


LR1aB 

My = threshold between y = 1 and y = 2 = 2.368 
Th = threshold between y = 2 and y = 3 = 1.768 
De = threshold between y = 1 and y = 3 = 2.060 


Class conditional distributions for visibility category
I (y = 1), II (y = 2) and III (y = 3) depicting
threshold values and means.
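The remark that the second threshold value of each pair was of no interest follows from the quadratic model having two roots. A minimal sketch, assuming normal class conditional distributions with unequal standard deviations and a threshold placed where the frequency-weighted densities are equal, is given below. This is the textbook two-class boundary and may differ in detail from Case III of Appendix B; the function names and example values are hypothetical.

```python
import math


def weighted_density(x, m, s, p):
    """Frequency-weighted normal density p * N(x; m, s)."""
    return p * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))


def quadratic_thresholds(m_a, s_a, p_a, m_b, s_b, p_b):
    """Points where the weighted densities of classes a and b are equal.

    Returns a sorted list with zero, one or two thresholds.
    """
    # Coefficients of a*x^2 + b*x + c = 0 from equating the log densities.
    a = 1.0 / (2 * s_b ** 2) - 1.0 / (2 * s_a ** 2)
    b = m_a / s_a ** 2 - m_b / s_b ** 2
    c = (m_b ** 2 / (2 * s_b ** 2) - m_a ** 2 / (2 * s_a ** 2)
         + math.log(p_a * s_b / (p_b * s_a)))
    if abs(a) < 1e-12:
        # Equal variances: the equation is linear and has one threshold.
        return [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the weighted densities never cross
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


# Hypothetical class parameters; two thresholds are returned.
print(quadratic_thresholds(1.0, 0.5, 0.3, 2.0, 1.0, 0.7))
```

Only the root lying between the two class means usually acts as a working threshold; the other falls in a tail where both densities are small.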
TABLE XXI (CONT.)


C. LR2a


Beem) = o0s05del22(BATR) + ,11284 x10 (D850) 


=O sosoKrooO) = wags (T1925) 


Class conditional distributions


Number of Frequency Mean value Standard
observations of of deviation
y of y y (p) ŷ (m) of ŷ (σ)
0 486 O46 .479 (mM) ae 
1 1040 oe ao 6 (m, ) 7209 
LR2aA: Ty = threshold between y = 0 and y =1= 
LR2aB: T. = threshold between y = 0 andy =1 = 


Class conditional distributions for combined visibility
categories I and II (y = 0) and visibility category III
(y = 1) depicting threshold values and means

[Plot: density versus predicted value (ŷ).]
TABLE XXI (CONT.)

Equation 2: ŷ = .01229 - .18917 x 10^[?](U1000)
            - .02088(T500) + .1339 x 10^[?](U500)
            + .15259 x 10^[?](D925) - .32705 x 10^[?](STRTFQ)
            + 7.50153(DEDP) - .03279(DVDP)

Class conditional distributions

     Number of      Frequency     Mean value          Standard
     observations   of            of                  deviation
y    of y           y (p)         ŷ (m)               of ŷ (σ)
0    [illegible]    .609          .319 (m0)           [illegible]
1    [illegible]    [illegible]   [illegible] (m1)    .194

LR2aA: T = threshold between y = 0 and y = 1 = .5102
LR2aB: T = threshold between y = 0 and y = 1 = .4972

Class conditional distributions for visibility category I
(y = 0) and II (y = 1) depicting threshold values and means.

[Plot: density versus predicted value (ŷ).]





TABLE XXI (CONT.) 


D. LR2b
Equation 1: y = .89952 - .04830(E850) + .02472(SHF) 


+ 2.17081(DTDP) + 6.81684 (DEDP) 


Class conditional distributions


Number of Frequency Mean value Standard
observations of of deviation
y of y y (p) ŷ (m) of ŷ (σ)
0 486 eos ~496 (My) ere 
iE 1040 Oe oo (m,) Oh 
LR2bA: T = threshold between y = 0 and y = 1 = .4922
LR2bB: T = threshold between y = 0 and y = 1 = .5119


Class conditional distributions for visibility categories
I and II (y = 0) and visibility category III (y = 1)
depicting threshold values and means.

[Plot: density versus predicted value (ŷ).]

TABLE XXI (CONT.)


Equation 2: y = .71769 + .11439 x10 °(v700) - .47810 x10~*(sTRTFOQ) 


+ 4.5433 (DTDP) 


Class conditional distributions


Number of Frequency Mean value Standard
observations of of deviation
y of y y (p) ŷ (m) of ŷ (σ)

0 296 oOo woo (m,) . 164 

iL Foo ool .476 (m, ) oars 
LR2bA: T = threshold between y = 0 and y = 1 = .5208
LR2bB: T = threshold between y = 0 and y = 1 = .4978


Class conditional distributions for visibility category I
(y = 0) and II (y = 1) depicting threshold values and means.
TABLE XXII.


CONTINGENCY TABLES AND RELATED STATISTICS
FROM LINEAR REGRESSION METHOD 1 (SINGLE
EQUATION), QUADRATIC THRESHOLD MODEL, FOR
BOTH DEPENDENT (1526 OBSERVATIONS) AND
INDEPENDENT (762 OBSERVATIONS) NORTH
ATLANTIC OCEAN AREA 3W, 15 MAY-15 JULY 1983,
DATA, WITH ALL PREDICTORS AVAILABLE TO THE
REGRESSION MODEL


LR1aA (Table XXI)


DEPENDENT DATA


FORECAST 





AO= .75 AAO= .21 

Al= .12 

Hoi=)s5 50 NESt= «23 

T$2-0.0 MSO 

T3112 = 27 ATS12=-.07 


OBSERVED


INDEPENDENT DATA


= A = 
Al= eal 


FORECAST 


TS2=0.0 ATS2= -.14 


TS12= -27 ATS10=—-09 


OBSERVED 







TABLE XXIII.


SAME AS TABLE XXII, EXCEPT USING THE
EQUAL-VARIANCE THRESHOLD MODEL


LR1aB (Table XXI)


DEPENDENT DATA 


FORECAST 


OBSERVED 


INDEPENDENT DATA 


FORECAST 


OBSERVED 








257 


nS 


TS2= 0. 


TS12= 


TS1= 


TS2= 


TS12= 


OF 


See, 


ale 


~41 


as0 


Sa 


ee 


40 


mane 


AAO= 22 
ATS1= .2/ 
ATS2=_ 14 


ATS12=~_ 93 


AAO= 17 
AUST as 26 
ATS2= -.14 
ATS12=-.94 





TABLE XXIV. CONTINGENCY TABLES AND RELATED STATISTICS
FROM LINEAR REGRESSION METHOD 2 (DECISION-
TREE), QUADRATIC THRESHOLD MODEL, FOR BOTH
DEPENDENT (1526 OBSERVATIONS) AND INDEPENDENT
(762 OBSERVATIONS) NORTH ATLANTIC OCEAN AREA
3W, 15 MAY-15 JULY 1983, DATA, WITH ALL
PREDICTORS AVAILABLE TO THE REGRESSION MODEL


LR2aA (Table XXI) 


DEPENDENT DATA 





AO= .76 AAO= 23 
— 
o a Le 
< 
O 
ud TS1= .43 hnSst= .30 
a 
O 
ee TS2= .13 ATS2= -00 
TS12= -36 ATS12= -06 
1 a 3 
OBSERVED 
INDEPENDENT DATA 
AO= .73 AAO= 14 
— 
nn Al= .14 
< 
O 
= ion= 62356 ATS1= .24 
O 
UL. TS2= .07 ATS2= -.06 


TS12= .30 ATS12=-.-0l 





OBSERVED 
TABLE XXV. SAME AS TABLE XXIV, EXCEPT USING THE
EQUAL-VARIANCE THRESHOLD MODEL


LR2aB (Table XXI) 


DEPENDENT DATA 


AO = awa AAO= .23 
~ 
Wu) Al= dls 
< 
O 
Ww TS1= .44 ATPSt= “ool 
x 
= ier OL 
UL Ts2= - ATS2= ° 





TS12= .37 ATS12=.07 
1 2 3 
OBSERVED 


INDEPENDENT DATA 


AO= oS AAO= paca 
— 
”) Al= eld 
< 
O 
~ TS1= 6 ATS1= .24 
~ 
5 
LL TS2= .08 ATS2= -.05 





TSI2= 139) ©6 ATS12=- 01 


1 2 3 
OBSERVED 







TABLE XXVI. CONTINGENCY TABLES AND RELATED STATISTICS FROM
LINEAR REGRESSION METHOD 2 (DECISION-TREE),
QUADRATIC THRESHOLD MODEL, FOR BOTH DEPENDENT
(1526 OBSERVATIONS) AND INDEPENDENT (762
OBSERVATIONS) NORTH ATLANTIC OCEAN AREA 3W,
15 MAY-15 JULY 1983, DATA, WITH ONLY THOSE
PREDICTORS IDENTIFIED AS BEST BY THE
PREISENDORFER METHODOLOGY AVAILABLE TO THE
REGRESSION MODEL


LR2bA (Table XXI) 


DEPENDENT DATA 


AQ = cp AAO = Pye 
b~ 
Y) A1= sss 
< 
O 
= ot 4 ArySad= 627 
O : 
LL mS52-" 205 ATS2= -.09 





Tsit2= .32 ATS12= .01 
1 2 3 
OBSERVED 


INDEPENDENT DATA 


AO= 173, AAO= aA 
b~ 
YY) A1= 
< m4 
O 
= TS1= 40 ATS1= a6 
O 
- TS2= (01 ATS2= ~,13 
iSi2=, .29 Anse] =. 02 





OBSERVED 
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TABLE XXVII. SAME AS TABLE XXVI, EXCEPT USING THE 
EQUAL-VARIANCE THRESHOLD MODEL 


LR2bB (Table XXI)


DEPENDENT DATA 





AO= ,74 AAO= 19 

= 

" Al= .14 

<q 

O 

Lu TSt= .42 ATS1= os 

fe 

© 

Le r52= .06 ATS2= = 07 
TS12=-33 ATS12= Oz 

1 2 3 
OBSERVED 
INDEPENDENT DATA 

AO= ,73 AAOQ= ye balk 

- 

” Al= 14 

<q 
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uJ TS1= .40 ATS1= .26 

‘ae 
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LL TS2= .03 ATS2= -.1l 
TS12= .30 AESIO=) =. 02 
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Fig. 4. The behavior of functional dependence (FD) as 
determined from 100 randomly generated data sets 
(Preisendorfer, 1983c) for EPI's of two through ten 
for (a) the North Atlantic Ocean area 3W, 15 May- 

15 July 1983, dependent data (1526 observations) 

and (b) the North Pacific Ocean, July 1979, dependent 
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(upper dashed), FD(05), (lower dashed), mean FD 
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